INVERTIBILITY OF SPARSE NON-HERMITIAN MATRICES


ANIRBAN BASAK AND MARK RUDELSON


Abstract. We consider a class of sparse random matrices of the form $A_n = (\xi_{i,j}\delta_{i,j})_{i,j=1}^n$, where
$\{\xi_{i,j}\}$ are i.i.d. centered random variables, and $\{\delta_{i,j}\}$ are i.i.d. Bernoulli random variables taking
value 1 with probability $p_n$, and prove a quantitative estimate on the smallest singular value for
$p_n = \Omega(\frac{\log n}{n})$, under a suitable assumption on the spectral norm of the matrices. This establishes
the invertibility of a large class of sparse matrices. For $p_n = \Omega(n^{-\alpha})$ with some $\alpha \in (0,1)$, we deduce
that the condition number of $A_n$ is of order $n$ with probability tending to one under the optimal
moment assumption on $\{\xi_{i,j}\}$. This, in particular, extends a conjecture of von Neumann about the
condition number to sparse random matrices with heavy-tailed entries. In the case that the random
variables $\{\xi_{i,j}\}$ are i.i.d. sub-Gaussian, we further show that a sparse random matrix is singular
with probability at most $\exp(-cnp_n)$ whenever $p_n$ is above the critical threshold $p_n = \Omega(\frac{\log n}{n})$.
The results also extend to the case when $\{\xi_{i,j}\}$ have a non-zero mean. We further find quantitative
estimates on the smallest singular value of the adjacency matrix of a directed Erdős-Rényi graph
whenever its edge connectivity probability is above the critical threshold $\Omega(\frac{\log n}{n})$.


1. Introduction 


This paper establishes bounds on the condition number of a sparse random matrix with independent
identically distributed (i.i.d.) entries and on the probability that such a matrix is singular.

For an $n \times n$ real matrix $A_n$, its singular values $s_k(A_n)$, $k = 1, 2, \ldots, n$, are the eigenvalues of
$|A_n| = \sqrt{A_n^* A_n}$ arranged in non-increasing order. The maximum and the minimum singular values
are often of particular interest, and they can be defined as

\[ s_{\max}(A_n) := s_1(A_n) := \sup_{x \in S^{n-1}} \|A_n x\|_2, \qquad s_{\min}(A_n) := s_n(A_n) := \inf_{x \in S^{n-1}} \|A_n x\|_2, \]

where $S^{n-1} := \{x \in \mathbb{R}^n : \|x\|_2 = 1\}$ and $\|\cdot\|_2$ denotes the Euclidean norm of a vector. This definition
means that the largest singular value $s_{\max}(A_n)$ is the operator or spectral norm of the matrix $A_n$,
and the smallest singular value $s_{\min}(A_n)$ provides a quantitative measure of the invertibility of $A_n$:

\[ s_{\min}(A_n) = \inf\{\|A_n - B\| : \det(B) = 0\}, \]

where $\|A_n - B\|$ denotes the operator norm of the $n \times n$ matrix $A_n - B$. Another such measure is
the condition number defined as

\[ \sigma(A_n) := \frac{s_{\max}(A_n)}{s_{\min}(A_n)}, \]

which often serves as a measure of stability of matrix algorithms in numerical linear algebra.
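As a numerical illustration of these definitions (not part of the original argument), the following Python sketch samples a sparse matrix with entries $\xi_{i,j}\delta_{i,j}$ and computes $s_{\max}$, $s_{\min}$ and the condition number with NumPy. The Rademacher choice for $\xi_{i,j}$, the values n = 200 and p = 0.2, and the helper names `sparse_matrix` and `extreme_singular_values` are assumptions made only for this example.

```python
import numpy as np

def sparse_matrix(n, p, rng):
    """Sample A = (xi_ij * delta_ij): Rademacher xi (an assumption), Bernoulli(p) delta."""
    xi = rng.choice([-1.0, 1.0], size=(n, n))   # centered, unit variance
    delta = rng.random((n, n)) < p              # Bernoulli(p) mask
    return xi * delta

def extreme_singular_values(a):
    """Return (s_max, s_min) of a square matrix."""
    s = np.linalg.svd(a, compute_uv=False)      # singular values, non-increasing
    return s[0], s[-1]

rng = np.random.default_rng(0)
n, p = 200, 0.2
a = sparse_matrix(n, p, rng)
s_max, s_min = extreme_singular_values(a)
cond = s_max / s_min if s_min > 0 else np.inf

# Ratios against the scales sqrt(n p), sqrt(p / n) and n discussed in the text.
print(f"s_max / sqrt(n p)    = {s_max / np.sqrt(n * p):.3f}")
print(f"s_min / sqrt(p / n)  = {s_min / np.sqrt(p / n):.3f}")
print(f"condition number / n = {cond / n:.3f}")
```

A single sample only indicates the orders of magnitude; the statements in this paper concern probabilities over the random matrix.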

In this paper we obtain lower bounds on the smallest singular value of a class of sparse random
matrices, and then, finding appropriate upper bounds on the maximum singular value, we deduce
that the condition number of such matrices is well controlled, and therefore they are well invertible
(see Theorem 1.1, Corollary 1.5 and Corollary 1.8).

Another class of random matrices of interest in combinatorics and graph theory consists of the
adjacency matrices of random graphs. Graphs, or more precisely their edges, can be either undirected
or directed. Both directed and undirected graphs are abundant in real life. One of the simplest
and most widely studied models in the undirected random graph literature is the Erdős-Rényi random
graph. Here we consider the directed version of that model (see Definition 1.10), and show that
the smallest singular value and the condition number of the adjacency matrix of such random graphs
are well controlled (see Theorem 1.11).

Analysis of extremal singular values of random matrices of large but fixed dimensions has received 
a lot of interest in recent years, due to its application in compressed sensing, geometric functional 
analysis, theoretical computer science, and other fields of science. Moreover, the bounds on the 
extreme singular values, especially the one on the smallest singular value, play a key role in obtaining 
the limiting spectral distribution of various non-Hermitian random matrix ensembles. For example, 
see [3, 6, 10, 11, 18, 24, 30, 35]. Likewise, the bounds on the smallest singular value obtained here 
play a crucial role in establishing the circular law for sparse non-Hermitian random matrices, which
is derived in a companion paper [4] (see also Remark 1.2). 

The study of the smallest singular value of a random matrix was initiated back in the 1940s, when
von Neumann and his collaborators used random matrices to test their algorithm for the inversion
of large matrices, and they speculated that

(1.1)  \[ s_{\min}(A_n) \sim n^{-1/2}, \qquad s_{\max}(A_n) \sim n^{1/2} \quad \text{with high probability} \]

(see [33, pp. 14, 477, 555] and [34, Section 7.8]). That is,

(1.2)  \[ \sigma(A_n) \sim n \quad \text{with high probability.} \]

A more precise version of this conjecture appeared in [27]. For Gaussian random matrices it was
proved that

\[ P(s_{\min}(A_n) \le \varepsilon n^{-1/2}) \sim \varepsilon, \quad \text{for every } \varepsilon \in (0,1), \]

see [9, 28]. However, the conjecture about the smallest singular value of a general random matrix
remained open for a long time. For example, the result was not known even for a random sign matrix,
i.e. a matrix with i.i.d. symmetric $\pm 1$ entries. The first bound in this direction was
proved in [19] for matrices with i.i.d. sub-Gaussian entries. Later, in [21], this was improved to a
lower bound on $s_{\min}$ under the finite fourth moment assumption. In particular, it was
shown that for every $\delta > 0$, there exists $\varepsilon > 0$ such that

\[ P(s_{\min}(A_n) \le \varepsilon n^{-1/2}) \le \delta. \]

Restricting to i.i.d. sub-Gaussian entries, their arguments also give the following strong
probability bound:

(1.3)  \[ P(s_{\min}(A_n) \le \varepsilon n^{-1/2}) \le C\varepsilon + c^n, \quad \text{for every } \varepsilon > 0, \]

where $C$ and $c \in (0,1)$ are some constants depending polynomially on the sub-Gaussian moment
of the entries. Finally, a matching upper bound was proved for sub-Gaussian entries in [22], and
improved under the finite fourth moment assumption in [31]. The necessary bounds on the largest
singular value follow from [13] for entries with finite fourth moment, and from [8] for i.i.d.
sub-Gaussian entries. This establishes (1.1)-(1.2) for random matrices with centered i.i.d. entries of
unit variance with finite fourth moment.

Another line of research is directed towards proving the universality of the smallest singular value
under small perturbation. This is largely motivated by its application in establishing the circular
law. Considering a random matrix $A_n$ of i.i.d. entries with finite second moment, Tao and Vu in [29]
established that for every $C' > 0$ there exists a $C > 0$ such that

(1.4)  \[ P(s_{\min}(A_n + M_n) \le n^{-C}) \le n^{-C'}, \]

where $M_n$ is a deterministic $n \times n$ matrix with $s_{\max}(M_n) = n^{O(1)}$.

The results described so far are only for dense matrices. However, sparse matrices are more
abundant in statistics, neural networks, financial modeling, electrical engineering, wireless
communications, and many other fields. We refer the reader to [1, Chapter 7] for other examples
and their relevant references. It is therefore natural to ask if there is an analogue of (1.1)-(1.2) for
sparse matrices. The analysis of sparse matrices is usually more challenging than that of their dense
counterparts because of the presence of a large number of zeros. Litvak and Rivasplata in [17]
considered a class of random sparse matrices. They imposed certain conditions on the columns and
rows of those matrices which prevent a large number of zeros, and then, under the finiteness of the
$(2+\epsilon)$-th moment, they showed that (1.1)-(1.2) hold.

Another way to construct a sparse random matrix is to multiply each of the entries by i.i.d.
Bernoulli random variables, denoted below by Ber($p_n$), where $p_n \to 0$. For such matrices it was shown in [29]
that (1.4) holds (a similar result appeared in [10]), as long as $p_n = \Omega(n^{-\alpha})$ for some $\alpha \in (0,1)$
(recall that $a_n = \Omega(b_n)$ iff $\liminf_{n \to \infty} a_n/b_n \ge K$ for some $K > 0$). In [10], under a minimal
moment assumption, it was also shown that $s_{\max}(A_n) \le n\sqrt{p_n}$ with probability tending to 1. This
implies that $\sigma(A_n) = O(n^C)$, for a large constant $C$, which is weaker than the conjecture (1.2) for
these sparse matrices.

On the other hand, it is straightforward to check that when $p_n \le \frac{\log n}{n}$, the probability that the
matrix contains a zero row, and is therefore singular, is positive and bounded below uniformly
in $n$. Thus the analogue of (1.1) cannot be extended beyond the $\frac{\log n}{n}$ barrier. Therefore it
would be interesting to check if the analogue of (1.1)-(1.2) holds for all $p_n = \Omega(\frac{\log n}{n})$.
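The zero-row obstruction can be made explicit under the simplifying assumption that $P(\xi_{i,j} = 0) = 0$, so that a row vanishes exactly when all of its Bernoulli variables are zero. The rows are independent, hence the probability of at least one zero row is $1 - (1 - (1-p_n)^n)^n$. The sketch below evaluates this expression for $p_n = c\log n/n$; the function name `prob_zero_row` and the chosen values of $c$ and $n$ are ours.

```python
import math

def prob_zero_row(n, p):
    """P(at least one of the n independent rows is identically zero)."""
    q = (1.0 - p) ** n            # a fixed row has all n entries equal to zero
    return 1.0 - (1.0 - q) ** n   # complement of "no row is zero"

for c in (0.5, 1.0, 2.0):
    for n in (10**3, 10**5):
        p = c * math.log(n) / n
        print(f"c = {c}, n = {n}: P(zero row) = {prob_zero_row(n, p):.4f}")
```

For $c < 1$ the values approach one, for $c = 1$ they stay near $1 - e^{-1}$, and for $c > 1$ they tend to zero, which is consistent with the threshold described above.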

In our first result below we provide an affirmative answer to the question above, under a suitable 
assumption on the maximal singular value. Note that it only requires the finiteness of the fourth 
moment. In the theorem below we consider a slightly different set-up, where we allow the entries 
on the diagonal to be arbitrary as long as they are not too big. This generalization is motivated by 
its role in the analysis of the adjacency matrix of a random directed graph as well as in the proof of 
the circular law (see Remark 1.2 for more details). The case of matrices with i.i.d. entries follows 
by conditioning on the diagonal entries, and showing that, with high probability, they satisfy the 
requirements of the main theorem. This is established in Corollary 1.5 and Corollary 1.8. Before 
stating the main theorem, for ease of writing, let us introduce the notation $[n] := \{1, 2, \ldots, n\}$.


Theorem 1.1. Let $\bar{A}_n$ be an $n \times n$ matrix with zeros on the diagonal and i.i.d. off-diagonal
entries $a_{i,j} = \delta_{i,j}\xi_{i,j}$, where $\delta_{i,j}$, $i,j \in [n]$, $i \ne j$, are independent Bernoulli random variables taking
value 1 with probability $p_n \in (0,1]$, and $\xi_{i,j}$, $i,j \in [n]$, $i \ne j$, are i.i.d. random variables with zero
mean, unit variance, and finite fourth moment. Fix $K \ge 1$, and let $\Omega_K := \{\|\bar{A}_n\| \le K\sqrt{np_n}\}$.
Further let $D_n$ be a real non-random diagonal matrix with $\|D_n\| \le R\sqrt{np_n}$, for some positive
constant $R$. Then there exist constants $0 < c_{1.1}, \bar{c}_{1.1}, C_{1.1}, \bar{C}_{1.1} < \infty$, depending on $K$, $R$, and on
the fourth moment of $\{\xi_{i,j}\}$, such that for any $\varepsilon > 0$, and

(1.5)  \[ p_n \ge \frac{\bar{C}_{1.1}\log n}{n}, \]

we have

(1.6)  \[ P\Big(\Big\{ s_{\min}(\bar{A}_n + D_n) \le C_{1.1}\,\varepsilon \exp\Big(-c_{1.1}\frac{\log(1/p_n)}{\log(np_n)}\Big)\sqrt{\frac{p_n}{n}} \Big\} \cap \Omega_K\Big) \le \varepsilon + \exp(-\bar{c}_{1.1} n p_n). \]

Remark 1.2. In Theorem 1.1 we studied the smallest singular value of $\bar{A}_n + D_n$ instead of
considering $A_n$, the matrix with i.i.d. entries. Since in directed Erdős-Rényi graphs we do not allow
self-loops, the diagonal entries of the adjacency matrix are zero. This has motivated us to consider
$\bar{A}_n$ in Theorem 1.1 with zeros on the diagonal. The addition of an extra diagonal matrix $D_n$ to $\bar{A}_n$
has been motivated by its application in identifying the limiting spectral distribution of $A_n$. It is
well known that in order to establish the convergence of the empirical distribution of the eigenvalues
of $A_n$, one needs to prove the convergence of the integral of $\log(\cdot)$ with respect to the empirical
distribution of the singular values of $A_n/\sqrt{np_n} - \omega I_n$ for Lebesgue a.e. $\omega \in \mathbb{C}$ (for more details see
[7]). Whenever the limiting distribution is compactly supported, one can restrict $\omega$ to a ball in the
complex plane.

Since $\log(\cdot)$ is unbounded near 0, one must have control on $s_{\min}(\cdot)$. Set $D_n = \omega\sqrt{np_n} I_n + \Lambda_n$
in Theorem 1.1, where $\Lambda_n$ is the diagonal matrix consisting of the diagonal entries of $A_n$. Upon
showing that $\|\Lambda_n\| = O(\sqrt{np_n})$ with high probability, we have the required estimate on $s_{\min}(\cdot)$ for
all bounded real $\omega$ (recall that in Theorem 1.1 we need $D_n$ to be a matrix with real entries). The
difficulty for complex $\omega$ arises because of an $\varepsilon$-net argument. See Remark 3.10 and Remark 4.5 for
more details.

In [4] we overcome this difficulty and extend Theorem 1.1 to complex $\omega$. Since such an extension
requires significant additional work, we defer it to [4], where it is applied to proving the circular
law for such matrices.

Remark 1.3. We prove Theorem 1.1 under the assumption of unit variance and finite fourth
moment of $\{\xi_{i,j}\}$. This assumption can easily be relaxed to unit variance and bounded
$(2+\eta)$-th moment, for any $\eta > 0$. The boundedness of the fourth moment is required in the proof of
Lemma 3.5, where it has been used to apply the Paley-Zygmund inequality. However, the Paley-Zygmund
inequality continues to hold as long as the $(2+\eta)$-th moment is finite (see [16, Lemma 3.5]). To
apply this version of the Paley-Zygmund inequality in the proof of Lemma 3.5 we need to bound
$E[|\sum_{i=1}^n \tilde{\xi}_i x_i|^{2+\eta}]$, where $\{\tilde{\xi}_i\}_{i \in [n]}$ are symmetrized versions of $\{\xi_i\}_{i \in [n]}$, and $x \in S^{n-1}$. To this
end, one can use [15, Theorem 6.20] to obtain the necessary bounds. Finiteness of the fourth moment
has also been used in Proposition 4.2. Since [25, Assumption 1.4] holds under the unit variance and
bounded $(2+\eta)$-th moment, one can instead use [25, Corollary 7.6] to arrive at the same conclusion.
For the clarity of presentation, we work with the finite fourth moment assumption.


To obtain the necessary estimates on the spectral norm in Theorem 1.1, we first focus on heavy-tailed
random variables, and establish the required bound when $p_n = \Omega(n^{-\alpha})$. For dense matrices,
the finiteness of the fourth moment is sufficient (and also necessary) to guarantee the necessary
bounds on $s_{\max}(\cdot)$ (see [13]). However, in the sparse case one needs finiteness of higher moments,
depending on the choice of $\alpha$. This is established in the second part of the next theorem. Before
stating this theorem let us recall that $a_n \sim b_n$ means that there exist positive constants $c, C$ such
that $cb_n \le a_n \le Cb_n$ for all large $n$.

Theorem 1.4. Fix $\alpha \in (0,1)$ and let $p_n = \Omega(n^{-\alpha})$. Denote

\[ q := \frac{2(2-\alpha)}{1-\alpha}. \]

Let $A_n$ be an $n \times n$ random matrix with i.i.d. entries $a_{i,j} = \delta_{i,j}\xi_{i,j}$, where $\delta_{i,j}$ are Bernoulli random
variables with $P(\delta_{i,j} = 1) = p_n$, and $\xi_{i,j}$ are independent copies of a centered random variable $\xi$ of
unit variance and finite fourth moment.

(i) If $E|\xi|^q \le K^q$, for some $K < \infty$, then, for any $r < q$, there exists some positive constant $C$
depending on $\alpha$, $K$, $r$, and the fourth moment of $\xi$, such that

\[ E\|A_n\|^r \le (C\sqrt{np_n})^r. \]

(ii) Let $p_n \sim n^{-\alpha}$. For any $r < q$, there exist $\mu, \nu > 0$, depending on $r$ and $q$, and a centered
random variable $\xi$ with $E|\xi|^r \le K$ and $E|\xi|^q = \infty$, such that

\[ P(\|A_n\| \le n^{\nu}\sqrt{np_n}) \le \exp(-Cn^{\mu}), \]

where $C$ is an absolute constant.


Recall that $\Lambda_n$ is the diagonal matrix consisting of the diagonal entries of $A_n$. Thus denoting
$\bar{A}_n := A_n - \Lambda_n$, we see that it satisfies the conditions of Theorem 1.1. To apply Theorem 1.1 for $A_n$
we also need to establish that $\|\Lambda_n\| = O(\sqrt{np_n})$ with large probability. This can be done similarly
as in Theorem 1.4 (see the proof of Corollary 1.5). Moreover, when $p_n = \Omega(n^{-\alpha})$ we have

\[ \frac{\log(1/p_n)}{\log(np_n)} = O(1). \]

Therefore we obtain the following corollary.


Corollary 1.5. Let $A_n$ be an $n \times n$ matrix with i.i.d. entries $a_{i,j} = \delta_{i,j}\xi_{i,j}$, where $\delta_{i,j}$, $i,j \in [n]$, are
independent Bernoulli random variables taking value 1 with probability $p_n$, and $\xi_{i,j}$, $i,j \in [n]$, are
i.i.d. centered random variables with variance at least one and finite fourth moment. Let $\{D_n\}_{n \in \mathbb{N}}$
be a sequence of real diagonal matrices such that $\|D_n\| \le R\sqrt{np_n}$ for all $n$, and for some $R < \infty$.
Assume that

\[ p_n = \Omega(n^{-\alpha}) \text{ for some } \alpha \in (0,1), \quad \text{and} \quad E|\xi_{i,j}|^q < \infty, \text{ where } q = \frac{2(2-\alpha)}{1-\alpha}. \]

Then for every $\delta > 0$, there exist an $\varepsilon > 0$ and $n_0$, depending on $R$, $\alpha$, $\delta$, and the $q$-th moment of $|\xi_{i,j}|$,
such that

(1.7)  \[ P\Big(s_{\min}(A_n + D_n) \le \varepsilon\sqrt{\frac{p_n}{n}}\Big) \le \delta \quad \text{for all } n \ge n_0. \]

Now note that combining Theorem 1.4 and Corollary 1.5 we immediately deduce that, for any
$\delta > 0$, there exist $K_0$ and $n_0$, depending on $\delta$, $\alpha$, and the $q$-th moment of $|\xi_{i,j}|$, such that

\[ P(\sigma(A_n) \le K_0 n) \ge 1 - \delta \quad \text{for all } n \ge n_0, \]

validating (1.2) for heavy-tailed sparse random matrices. Assertion (ii) of Theorem 1.4 shows that
the moment condition $E|\xi_{i,j}|^q < \infty$ is optimal.

Next we consider sparse matrices with a lighter tail. To this end, recall the definition of
sub-Gaussian random variables.

Definition 1.6. For a random variable $\xi$, the sub-Gaussian norm of $\xi$, denoted by $\|\xi\|_{\psi_2}$, is defined
as

\[ \|\xi\|_{\psi_2} := \sup_{k \ge 1} k^{-1/2}\|\xi\|_k, \]

where for every $k \in \mathbb{N}$, $\|\xi\|_k := (E|\xi|^k)^{1/k}$. If the sub-Gaussian norm is finite, the random variable
$\xi$ is called sub-Gaussian.

This definition of the norm $\|\cdot\|_{\psi_2}$ is equivalent to the canonical one, see [20].

We now state our result about the spectral norm of sparse random matrices with sub-Gaussian
entries.

Theorem 1.7. There exists $C_0 \ge 1$ such that the following holds. Let $n \in \mathbb{N}$ and $p_n \in (0,1]$ be
such that $p_n \ge C_0\frac{\log n}{n}$. Let $A_n$ be an $n \times n$ random matrix with i.i.d. entries $a_{i,j} = \delta_{i,j}\xi_{i,j}$, where
$\delta_{i,j}$ are Bernoulli random variables with $P(\delta_{i,j} = 1) = p_n$ and $\xi_{i,j}$ are centered sub-Gaussian random
variables. Then there exist positive constants $C_{1.7}, c_{1.7}$, depending on the sub-Gaussian norm of
$\{\xi_{i,j}\}$, so that

\[ P(\|A_n\| \ge C_{1.7}\sqrt{np_n}) \le \exp(-c_{1.7} n p_n). \]

Proceeding as in Theorem 1.7 we can also show that $\|\Lambda_n\| = O(\sqrt{np_n})$ with large probability.
Therefore we obtain the following corollary.

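Corollary 1.8. Let $A_n$ be an $n \times n$ matrix with i.i.d. entries $a_{i,j} = \delta_{i,j}\xi_{i,j}$, where $\delta_{i,j}$, $i,j \in [n]$, are
independent Bernoulli random variables taking value 1 with probability $p_n$, and $\xi_{i,j}$, $i,j \in [n]$, are
i.i.d. centered sub-Gaussian random variables with variance at least one. Let $\{D_n\}$ be a sequence
of real diagonal matrices such that $\|D_n\| \le R\sqrt{np_n}$ for all $n$, and for some $R < \infty$. Then there
exist constants $0 < c_{1.8}, \bar{c}_{1.8}, C_{1.8}, \bar{C}_{1.8} < \infty$, depending on $R$ and the sub-Gaussian norm of $\{\xi_{i,j}\}$,
such that for

\[ p_n \ge \frac{\bar{C}_{1.8}\log n}{n}, \]

and any $\varepsilon > 0$,

(1.8)  \[ P\Big(s_{\min}(A_n + D_n) \le C_{1.8}\,\varepsilon \exp\Big(-c_{1.8}\frac{\log(1/p_n)}{\log(np_n)}\Big)\sqrt{\frac{p_n}{n}}\Big) \le \varepsilon + \exp(-\bar{c}_{1.8} n p_n). \]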


Since $p_n = \Omega(\frac{\log n}{n})$ we have

\[ \frac{\log(1/p_n)}{\log(np_n)} = O\Big(\frac{\log n}{\log\log n}\Big). \]
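A short sketch of why this ratio is of the stated order, under the assumption (made here only for concreteness) that $np_n \ge C\log n$ with $C \ge 1$ and $n \ge 3$, so that $\log\log n > 0$:

```latex
\[
\log(1/p_n) \le \log\frac{n}{C\log n} \le \log n,
\qquad
\log(np_n) \ge \log(C\log n) \ge \log\log n,
\]
\[
\text{and hence}\qquad
\frac{\log(1/p_n)}{\log(np_n)} \le \frac{\log n}{\log\log n}.
\]
```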

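Thus we deduce that for all $p_n = \Omega(\frac{\log n}{n})$, and for every $\varepsilon > 0$,

\[ P\Big(\sigma(A_n) \ge C\varepsilon^{-1} n^{1 + \frac{c}{\log\log n}}\Big) \le \varepsilon + \exp(-cnp_n), \]

where $c, C$ are some constants, depending only on the sub-Gaussian norm of $\xi_{i,j}$. This validates
(1.2) up to a factor of $n^{\frac{c}{\log\log n}}$.

Also, letting $\varepsilon \to 0$, we obtain the optimal bound for the probability that the sparse random
matrix is singular:

\[ P(A_n \text{ is singular}) \le \exp(-cnp_n), \qquad p_n = \Omega\Big(\frac{\log n}{n}\Big). \]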

Remark 1.9. In Theorem 1.7 we considered only sub-Gaussian random variables. One can consider
a more general class of light-tailed random variables. Namely, we can consider random variables $\xi$
such that

(1.9)  \[ E|\xi|^h \le C^h h^{\beta h}, \quad \text{for all } h \ge 1, \text{ and for some constants } C \text{ and } \beta. \]

Note that $\beta = 1/2$ yields the sub-Gaussian random variables. Considering sparse random matrices
with i.i.d. copies of $\xi$ satisfying (1.9) for $\beta > 1/2$, one can show that $\|A_n\| = O(\sqrt{np_n})$, for all $p_n$
satisfying $np_n = \Omega((\log n)^{2\beta})$. For an outline of the proof see Remark 6.3.
We now extend our result to the adjacency matrix of a directed Erdős-Rényi random graph. Let
us begin with the relevant definitions.


Definition 1.10. Let $G_n$ be a random directed graph on $n$ vertices, with vertex set $[n]$, such that
for every $i \ne j$, a directed edge from $i$ to $j$ is present with probability $p$, independently of everything
else. Assume that the graph $G_n$ is simple, i.e. no self-loops or multiple edges are present. We call
this graph $G_n$ a directed Erdős-Rényi graph with edge connectivity probability $p$. For any such graph
$G_n$ we denote by $\mathrm{Adj}_n := \mathrm{Adj}(G_n)$ its adjacency matrix. That is, for any $i, j \in [n]$,

\[ \mathrm{Adj}_n(i,j) = \begin{cases} 1 & \text{if a directed edge from } i \text{ to } j \text{ is present in } G_n, \\ 0 & \text{otherwise.} \end{cases} \]
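For concreteness, Definition 1.10 can be sampled as in the following minimal Python sketch. The function name `directed_er_adjacency`, the list-of-rows representation, and the `seed` parameter are choices made for this illustration only.

```python
import random

def directed_er_adjacency(n, p, seed=None):
    """Adjacency matrix (list of rows) of a directed Erdos-Renyi graph without self-loops."""
    rng = random.Random(seed)
    # Entry (i, j) is 1 with probability p for i != j; the diagonal stays 0.
    return [[1 if i != j and rng.random() < p else 0 for j in range(n)]
            for i in range(n)]

adj = directed_er_adjacency(6, 0.5, seed=1)
for row in adj:
    print(row)
```

The matrix is in general not symmetric, since the edges from $i$ to $j$ and from $j$ to $i$ are drawn independently.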


We now have the following theorem on the smallest singular value of the adjacency matrix of a
directed Erdős-Rényi graph.


Theorem 1.11. Let $\mathrm{Adj}_n$ be the adjacency matrix of a directed Erdős-Rényi graph, with edge
connectivity probability $p_n \in (0,1)$. Fix $R \ge 1$, and let $D_n$ be a non-random real valued diagonal
matrix with $\|D_n\| \le R\sqrt{np_n}$. Then there exist constants $0 < c_{1.11}, \bar{c}_{1.11}, C_{1.11}, \bar{C}_{1.11} < \infty$,
depending only on $R$, such that for

\[ \frac{\bar{C}_{1.11}\log n}{n} \le p_n \le 1 - \frac{\bar{C}_{1.11}\log n}{n}, \]

and any $\varepsilon > 0$,

(1.10)  \[ P\Big(s_{\min}(\mathrm{Adj}_n + D_n) \le C_{1.11}\,\varepsilon \exp\Big(-c_{1.11}\frac{\log(1/p_n)}{\log(np_n)}\Big)\sqrt{\frac{p_n}{n}}\Big) \le \varepsilon + \exp(-\bar{c}_{1.11} n p_n). \]


In Theorem 1.1 the entries of the matrix under consideration have zero mean, so we cannot apply
that result directly to prove Theorem 1.11. We extend Theorem 1.1 to non-centered random
variables (see Theorem 7.1), which yields the desired result for directed Erdős-Rényi graphs.

Outline of the paper. 

• In Section 2, we introduce the necessary concepts and provide an outline of the proof
of Theorem 1.1. The proof is based on decomposing the unit sphere into compressible,
dominated, and incompressible vectors, and controlling the infimum of $\|(\bar{A}_n + D_n)x\|_2$ for
each of these three parts.

• The main result in Section 3 is a lower bound on the infimum over compressible and
dominated vectors (see Proposition 3.1). The idea of splitting the sphere into compressible (close
to sparse) and incompressible vectors originated in [16] and was further developed in [19, 21].
Yet, for sparse random matrices, it can be implemented only for vectors with a relatively
large support. To treat the vectors with a very small support, we had to introduce a new
class, namely dominated vectors. Handling these vectors requires a new technique based on
the sparsity of the matrix.

First we prove a concentration result in Lemma 3.2. Using this lemma, we derive a lower
bound for the infimum of $\|(\bar{A}_n + D_n)x\|_2$ over $O(p^{-1})$-compressible and dominated vectors
in Lemma 3.3 and Lemma 3.4. To deal with $cn$-compressible and dominated vectors, we
first derive a result in Corollary 3.7, using which we prove the desired bound for dominated vectors
in Lemma 3.8, and then we finally prove Proposition 3.1. Before concluding the section we
point out that the techniques in this section allow us to consider $D_n$ with complex entries
in Proposition 3.1 (see Remark 3.10).

• In Section 4 we prove a result about the infimum for vectors with small LCD (see Proposition
4.1). Before proving this result, we first recall a few preliminary facts about the LCD. Unlike
Section 3, this result does not extend to $D_n$ with complex entries (see Remark 4.5).

• In Section 5, we combine the results from Section 3 and Section 4, and complete the proof of
Theorem 1.1.

• In Section 6, we prove Theorem 1.4 and Theorem 1.7, establishing the necessary estimates
on the spectral norm for sparse random matrices with heavy-tailed and sub-Gaussian random
variables. Combining these results with Theorem 1.1, we prove Corollary 1.5 and Corollary
1.8. Finally, in Remark 6.3 we outline an extension of Theorem 1.7 to random variables
satisfying (1.9).

• Section 7 is devoted to the proof of Theorem 1.11. We begin by extending Theorem
1.1 to matrices with non-centered random entries (see Theorem 7.1). To handle random
variables with non-zero mean we need a folding trick, which we explain in detail. The rest of
the proof of Theorem 7.1 largely follows that of Theorem 1.1. We provide a detailed
outline of how to extend the results of Section 3 and Section 4 to this more general
set-up. Finally we show that Theorem 7.1 can be appropriately adapted to prove Theorem
1.11.


2. Preliminaries and Proof Outline 

Without loss of generality, we may assume that $p_n \le c(K+R)^{-2}$, for some small positive constant
$c$, since for larger values of $p_n$, the entries $a_{i,j}$ have variance bounded below by an absolute constant.
In such a case, Theorem 1.1 follows from [21].

Since

\[ s_{\min}(\bar{A}_n + D_n) = \inf_{x \in S^{n-1}} \|(\bar{A}_n + D_n)x\|_2, \]

to prove Theorem 1.1, we need to find a lower bound on this infimum. For dense matrices this is
done via decomposing the unit sphere into compressible and incompressible vectors, and obtaining
necessary bounds on the infimum on both of these parts separately (cf. [19, 21, 23, 32]). To carry
out the argument for sparse matrices we introduce another class of vectors which we call dominated
vectors. Below we define the necessary concepts, and explain the necessity of the dominated vectors
along with an outline of the proof.

We start with the definition of compressible and incompressible vectors. 

Definition 2.1. Fix $m < n$. The set of $m$-sparse vectors is given by

\[ \mathrm{Sparse}(m) := \{x \in \mathbb{R}^n \mid |\mathrm{supp}(x)| \le m\}, \]

where $|S|$ denotes the cardinality of a set $S$. Furthermore, for any $\delta > 0$, the vectors which are
$\delta$-close to $m$-sparse vectors in Euclidean norm are called $(m, \delta)$-compressible vectors. The set of
all such vectors will hereafter be denoted by $\mathrm{Comp}(m, \delta)$. Thus,

\[ \mathrm{Comp}(m, \delta) := \{x \in S^{n-1} \mid \exists\, y \in \mathrm{Sparse}(m) \text{ such that } \|x - y\|_2 \le \delta\}. \]

The vectors in $S^{n-1}$ which are not compressible are defined to be incompressible, and the set of all
incompressible vectors is denoted as $\mathrm{Incomp}(m, \delta)$.
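Membership in $\mathrm{Comp}(m, \delta)$ is easy to test: the closest $m$-sparse vector to $x$ keeps the $m$ largest coordinates in absolute value, so the distance from $x$ to $\mathrm{Sparse}(m)$ is the Euclidean norm of the remaining coordinates. The Python sketch below uses this fact; the names `dist_to_sparse` and `is_compressible` and the two test vectors are ours.

```python
import math

def dist_to_sparse(x, m):
    """Euclidean distance from x to the set of m-sparse vectors."""
    tail = sorted((abs(t) for t in x), reverse=True)[m:]   # drop the m largest
    return math.sqrt(sum(t * t for t in tail))

def is_compressible(x, m, delta):
    """True if the unit vector x lies in Comp(m, delta)."""
    return dist_to_sparse(x, m) <= delta

n = 100
spread = [1 / math.sqrt(n)] * n                                   # mass spread evenly
peaked = [math.sqrt(0.99)] + [0.1 / math.sqrt(n - 1)] * (n - 1)   # almost 1-sparse

print(is_compressible(peaked, 1, 0.2))    # distance to Sparse(1) is 0.1
print(is_compressible(spread, 10, 0.2))   # distance to Sparse(10) is sqrt(0.9)
```

The evenly spread vector is therefore in $\mathrm{Incomp}(10, 0.2)$, while the peaked one is in $\mathrm{Comp}(1, 0.2)$.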

Next we define the dominated vectors. These are also close to sparse vectors, but in a different
sense.

Definition 2.2. For any $x \in S^{n-1}$, let $\pi_x : [n] \to [n]$ be a permutation which arranges the absolute
values of the coordinates of $x$ in a non-increasing order. For $1 \le m \le m' \le n$, denote by
$x_{[m:m']} \in \mathbb{R}^n$ the vector with coordinates

\[ x_{[m:m']}(j) = x(j) \cdot \mathbf{1}_{[m:m']}(\pi_x(j)). \]

In other words, we include in $x_{[m:m']}$ the coordinates of $x$ which take places from $m$ to $m'$ in the
non-increasing rearrangement.

For $\alpha < 1$ and $m \le n$ define the set of vectors with dominated tail as follows:

\[ \mathrm{Dom}(m, \alpha) := \{x \in S^{n-1} \mid \|x_{[m+1:n]}\|_2 \le \alpha\sqrt{m}\,\|x_{[m+1:n]}\|_\infty\}. \]

Note that by definition, $\mathrm{Sparse}(m) \cap S^{n-1} \subset \mathrm{Dom}(m, \alpha)$, since for $m$-sparse vectors, $x_{[m+1:n]} = 0$.
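The condition defining $\mathrm{Dom}(m, \alpha)$ can be checked directly, as in the following Python sketch (the helper names `tail` and `in_dom` and the example vectors are hypothetical). The last example is a vector at distance 0.2 from $\mathrm{Sparse}(16)$ whose tail is carried by a single coordinate, so it is dominated without being sparse.

```python
import math

def tail(x, m):
    """Absolute values of x_[m+1:n]: all but the m largest coordinates of x."""
    return sorted((abs(t) for t in x), reverse=True)[m:]

def in_dom(x, m, alpha):
    """Check ||x_[m+1:n]||_2 <= alpha * sqrt(m) * ||x_[m+1:n]||_inf for a unit vector x."""
    t = tail(x, m)
    l2 = math.sqrt(sum(v * v for v in t))
    linf = max(t, default=0.0)
    return l2 <= alpha * math.sqrt(m) * linf

n = 100
sparse_vec = [0.5] * 4 + [0.0] * (n - 4)                    # 4-sparse unit vector
spread = [1 / math.sqrt(n)] * n                             # evenly spread unit vector
one_spike = [math.sqrt(0.06)] * 16 + [0.2] + [0.0] * 83     # tail is one coordinate

print(in_dom(sparse_vec, 4, 0.5))    # zero tail, condition holds trivially
print(in_dom(spread, 4, 0.5))        # tail has many equal coordinates
print(in_dom(one_spike, 16, 0.5))    # tail l2 norm equals its sup norm
```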
We now provide an outline of the proof. For ease of writing, hereafter we will often drop the
subscript in $p_n$ and write $p$ instead.

The proof of Theorem 1.1 proceeds by first bounding the infimum over compressible and dominated
vectors, and then the same for the incompressible vectors. As in [21], the first step is to control
the infimum of $\|\bar{A}_n x\|_2$ for sparse vectors (for clarity of explanation we take $D_n = 0$ in the rest of the
section). This was done in [21] using a small ball probability estimate and an $\varepsilon$-net argument (see
[21, Corollary 2.7] and [21, Proposition 2.5]). However, the sparseness of the entries prevents us from
using these techniques here. For example, adapting [21, Proposition 2.5] to the sparse set-up one can
at best hope to obtain that for any fixed $x \in S^{k-1}$

\[ P(\|A_n x\|_2 \le \eta\sqrt{np}) \le e^{-cnp}, \]

for a tall $n \times k$ sparse matrix $A_n$, and for some $\eta, c > 0$. However, when one tries to use the
$\varepsilon$-net argument, then it is clear we must have $k = O(np)$. Since in the sparse regime $p \to 0$, this is
not enough. Moreover, to lift the result for tall matrices to square matrices and sparse $x \in S^{n-1}$
one needs to take another union bound (see the proof of [21, Lemma 3.3]), which also fails here.

Instead, using Chernoff's bound we show that there are large submatrices inside $\bar{A}_n$ such that
one part of those submatrices contains only one non-zero entry per row, and the rest of them are
zero (see Lemma 3.2). This essentially means that $(\bar{A}_n x)_i$ is just $a_{i,j}x_j$, for some $j \ne i$, when $x$
is a sparse vector. Thus contributions of different coordinates of $x$ do not cancel, which allows
us to avoid using the $\varepsilon$-net argument at this step. This is enough to control $\|\bar{A}_n x\|_2$ for very sparse
vectors. More specifically, this argument works for $O(p^{-1})$-sparse vectors of unit norm. These
estimates automatically extend to compressible and dominated vectors with $m = O(p^{-1})$. To carry
out the program, one needs to improve these estimates for $cn$-sparse vectors, for some $c \in (0,1)$.
To this end, we need some estimates on the small ball probability. For such estimates, the following
definition of the Lévy concentration function turns out to be useful.
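Before turning to that definition, the observation about rows with a single non-zero entry can be illustrated numerically. The Python sketch below draws only the Bernoulli pattern of the matrix, ignores the zero diagonal for simplicity, and counts the rows having exactly one non-zero entry among the columns of a small support. The function name `rows_with_unique_nonzero` and the parameters n = 400, p = 0.02 are assumptions of this example and are not taken from Lemma 3.2.

```python
import random

def rows_with_unique_nonzero(mask, support):
    """Rows of the 0/1 pattern `mask` with exactly one non-zero entry in the columns `support`."""
    return [i for i, row in enumerate(mask)
            if sum(row[j] for j in support) == 1]

rng = random.Random(0)
n, p = 400, 0.02
mask = [[1 if rng.random() < p else 0 for _ in range(n)] for _ in range(n)]

support = range(int(1 / (8 * p)))      # a support of size about 1/(8p), here 6 columns
good = rows_with_unique_nonzero(mask, support)

# Each row is "good" with probability k p (1 - p)^(k - 1), where k = len(support).
k = len(support)
expected = n * k * p * (1 - p) ** (k - 1)
print(len(good), "good rows; the expected number is about", round(expected, 1))
```

On a good row $i$, the product $(\bar{A}_n x)_i$ for $x$ supported on these columns reduces to a single term, so no cancellation can occur.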

Definition 2.3. Let $Z$ be a random variable in $\mathbb{R}^n$. For every $\varepsilon > 0$, the Lévy concentration function
of $Z$ is defined as

\[ \mathcal{L}(Z, \varepsilon) := \sup_{u \in \mathbb{R}^n} P(\|Z - u\|_2 \le \varepsilon), \]

where $\|\cdot\|_2$ denotes the Euclidean norm.
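For a real-valued discrete random variable the supremum in Definition 2.3 can be computed exactly, because an optimal window $[u - \varepsilon, u + \varepsilon]$ can always be shifted so that its left endpoint is an atom. The Python sketch below does this for a single entry $\xi\delta$ of a sparse matrix, assuming Rademacher $\xi$; the names `levy_concentration` and `sparse_rademacher` are ours.

```python
def levy_concentration(atoms, eps):
    """sup_u P(|Z - u| <= eps) for a discrete real Z given as {value: probability}."""
    best = 0.0
    for left in atoms:                      # windows [left, left + 2 eps] suffice
        mass = sum(q for v, q in atoms.items() if left <= v <= left + 2 * eps)
        best = max(best, mass)
    return best

def sparse_rademacher(p):
    """Law of xi * delta with xi = +1 or -1 (probability 1/2 each) and delta ~ Ber(p)."""
    return {0.0: 1.0 - p, 1.0: p / 2, -1.0: p / 2}

law = sparse_rademacher(0.1)
for eps in (0.25, 0.5, 1.0):
    print(eps, levy_concentration(law, eps))
```

For small $\varepsilon$ the value is $1 - p$, the mass of the atom at zero, which shows that for a single sparse entry the concentration function cannot be smaller than $1 - p$.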

Once we obtain the necessary estimates for $cn$-sparse vectors, we extend them to compressible and
dominated vectors using the $\varepsilon$-net argument and the union bound.

Next we need to bound the infimum for incompressible vectors. To this end, we need the following
lemma of [21] (see [21, Lemma 3.5]).

Lemma 2.4 (Invertibility via distance). For $j \in [n]$, let $A_{n,j} \in \mathbb{R}^n$ be the $j$-th column of $A_n$, and
let $H_{n,j}$ be the subspace of $\mathbb{R}^n$ spanned by $\{A_{n,i}, i \in [n] \setminus \{j\}\}$. Then for any $\varepsilon, \rho > 0$, and $M < n$,

(2.1)  \[ P\Big(\inf_{x \in \mathrm{Incomp}(M, \rho)} \|A_n x\|_2 \le \varepsilon\rho^2\sqrt{\frac{p}{n}}\Big) \le \frac{1}{M}\sum_{j=1}^n P\big(\mathrm{dist}(A_{n,j}, H_{n,j}) \le \rho\sqrt{p}\,\varepsilon\big). \]


Remark 2.5. Lemma 2.4 can be extended to the case when the event on the LHS of (2.1) is
intersected with an event $\Omega$, and in that case Lemma 2.4 continues to hold if the RHS of (2.1) is
replaced by intersecting each of the events under the summation sign with the same event $\Omega$. In the
proof of Theorem 1.1, we will use this slightly more general version of Lemma 2.4. Since the proof
of this general version of Lemma 2.4 is a straightforward adaptation of the proof of [21, Lemma 3.5],
we omit the details.


Proceeding similarly as in [21] we see that we need to find small ball probability estimates
for incompressible vectors. However, the small ball probability estimates used in the proof of
Proposition 3.1 are too weak for this purpose. The rich additive structure of the incompressible
vectors is helpful here. For a vector $x \in \mathbb{R}^n$, when each coordinate of $x$ is rational, a suitable
measure for the additive structure in $x$ is the least common multiple of the denominators of the
coordinates. Generalizing this idea, when the coordinates of the vector $x$ are real, a notion termed
the least common denominator (LCD) was introduced in [21, 23] to capture the additive structure
in $x$. In our current set-up of sparse matrices, adapting their definition, we have the following
definition of the LCD:

Definition 2.6. For $x \in S^{n-1}$, the LCD of $x$ is defined as

\[ D(x) := \inf\Big\{\theta > 0 : \mathrm{dist}(\theta x, \mathbb{Z}^n) < (\delta_0 p)^{-1/2}\sqrt{\log_+\big(\sqrt{\delta_0 p}\,\theta\big)}\Big\}, \]

where $\delta_0 \in (0,1)$ is an appropriate constant (see Remark 2.7 below for the choice of $\delta_0$).

Remark 2.7. We note that there exist $\delta_0, \varepsilon'_0 \in (0,1)$ such that for any $\varepsilon < \varepsilon'_0$, $\mathcal{L}(\xi\delta, \varepsilon) \le 1 - \delta_0 p$,
where $\xi$ is a random variable with unit variance and finite fourth moment, and $\delta$ is a Ber($p$) random
variable, independent of each other (for more details see [32, Lemma 3.3]). We choose this $\delta_0$ in
Definition 2.6 above.
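To get a feeling for Definition 2.6, the infimum can be approximated from above by scanning a grid of values of $\theta$. The Python sketch below does this; the grid step, the cut-off `theta_max`, the value $\delta_0 = 0.25$ and the function names are assumptions of the example, and the returned value is only the first grid point satisfying the condition, not the exact LCD.

```python
import math

def dist_to_integers(v):
    """Euclidean distance from v to the integer lattice Z^n."""
    return math.sqrt(sum((t - round(t)) ** 2 for t in v))

def lcd_grid(x, p, delta0, theta_max, step=1e-3):
    """First grid point theta = k * step satisfying the LCD condition, or None."""
    s = math.sqrt(delta0 * p)
    k = 1
    while k * step <= theta_max:
        theta = k * step
        log_plus = max(math.log(s * theta), 0.0)          # log_+ of sqrt(delta0 p) theta
        if dist_to_integers([theta * t for t in x]) < math.sqrt(log_plus) / s:
            return theta
        k += 1
    return None

# For a coordinate vector the condition first holds just above 1 / sqrt(delta0 * p) = 20.
print(lcd_grid([1.0, 0.0, 0.0], p=0.01, delta0=0.25, theta_max=30.0))
```

Since $\log_+$ vanishes for $\theta \le (\delta_0 p)^{-1/2}$, the condition cannot hold there, so $D(x) \ge (\delta_0 p)^{-1/2}$ for every $x$.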


Using the LCD of a vector, one can improve the small ball probability estimates (cf. [32, Theorem
6.3]). Using this, and proceeding as in [21], vectors with large LCD are taken care of. To deal with
the vectors of small LCD, we first split them into level sets. Inside each level set we use the small
ball probability estimate once again, and a careful $\varepsilon$-net argument is carried out (based on the
value of the LCD in that level set) to obtain the necessary bounds. The result then follows by a
union bound.

In the dense set-up one can show that the LCD on the set of incompressible vectors under
consideration is $\Omega(\sqrt{n})$ (see [20, Lemma 6.1]). However, in the sparse set-up one cannot guarantee similar
lower bounds on the LCD, due to weak control on the compressible vectors. To this end, we use
a lower bound on the LCD depending on $\|\cdot\|_\infty$ (see Proposition 4.4). This demands some control on
$\|\cdot\|_\infty$ on the incompressible vectors, which requires the introduction of dominated vectors.


3. Compressible and dominated vectors 

In this section we obtain a lower bound on the infimum of $\|(\bar{A}_n + D_n)x\|_2$ over compressible and
dominated vectors. More specifically, we will prove the following proposition in this section. Before
stating the result let us recall that for any $\gamma \in \mathbb{R}$, $\lceil\gamma\rceil$ denotes the ceiling of $\gamma$, i.e. it is the smallest
integer greater than or equal to $\gamma$.
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Proposition 3.1. Let p satisfy (1.5). Denote 


log 1/ (8p) 

log s/pfi 


Let A n be an nxn matrix with zeros on the diagonal and off-diagonal entries a l ,j = dij where 
5ij are i.i.d. Bernoulli random variables with P (5ij = 1) = p, and fij are centered i.i.d. random 
variables with unit variance and finite fourth moment. Let K,R > 1, and assume that D n is a 
non-random diagonal matrix with real entries such that \\D n \\ < Ryfpn. Then there exist constants 
0 < C3 \,c% \, C % 1 < 00, depending only on K,R, and the fourth moment of 

such that for any p _1 < M < C 3 \n, 

P(3x € Dom(M, (C 3 \ (K + R))~ A ) U Comp(M, p ) 

||(A n + D n )x\\ 2 < C% i(K + R)py/np and ||A n || < Ky/pn) < exp(— S 3 \pn), 
where p = (C 3 \{K + R))~ io ~ 6 . 
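
To get a feeling for the size of $\ell_0$ and $\rho$ in Proposition 3.1, the following sketch evaluates them numerically. The function names and the sample values of $n$, $p$, $K$, $R$ and of the constant `C` are illustrative assumptions; the proposition itself does not specify a value for $C_{3.1}$.

```python
import math


def ell_0(n, p):
    """ell_0 = ceil( log(1/(8p)) / log(sqrt(pn)) ), as in Proposition 3.1."""
    return math.ceil(math.log(1.0 / (8.0 * p)) / math.log(math.sqrt(p * n)))


def rho(n, p, C, K, R, shift=6):
    """rho = (C (K + R))^(-ell_0 - shift); shift=6 matches Proposition 3.1.
    C is a placeholder for the unspecified constant C_{3.1}."""
    return (C * (K + R)) ** (-(ell_0(n, p) + shift))


# Denser regime, p >= n^(-1/3)/4: a single block suffices.
dense_levels = ell_0(10**6, 1e-2)
# Sparser regime, p < n^(-1/3)/4: more than one block is needed.
sparse_levels = ell_0(10**6, 1e-3)
```

The number of levels $\ell_0$ grows as $p$ decreases toward the $\log n / n$ threshold, which is why the bound on $\rho$ deteriorates in the sparse regime.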


The proof splits into two steps. First, we consider vectors which are close to $(1/(8p))$-sparse. As explained above, for such vectors, the small ball probability bound is too weak, which forces us to use a method specially designed for sparse matrices. At the second step of the proof, we consider vectors which are close to $M$-sparse, but not to $(1/(8p))$-sparse. For such moderately sparse vectors, a better control of the Lévy concentration function is available.


3.1. Vectors close to very sparse. We first establish a uniform lower bound for $\|(A_n+D_n)x\|_2$ over the set of unit vectors which are close to $(1/(8p))$-sparse. Our approach is based on the observation that for any such vector there is a relatively large number of rows of $A_n$ which have exactly one non-zero entry in the columns corresponding to its support. Unfortunately, this number is insufficient to use the union bound over all supports. This forces us to use a simple chaining type argument. The support of the vector is divided into blocks of increasing sizes. We use one of these blocks carrying a substantial part of the $\ell_2$ norm of the vector to obtain the small ball probability bound, and show that the contribution of the other blocks does not destroy it.

To run this procedure efficiently, we need a combinatorial lemma about the structure of the set of rows having exactly one non-zero entry in the columns corresponding to the chosen block. To this end, we divide the set of these columns in two parts, and look for those rows for which the first part has exactly one non-zero entry, and the second one has only zeros. Such zero rows would be useful in showing that the contributions of different coordinates within the selected block add up correctly. For ease of writing, for any positive integer $\gamma \le n$, let us denote by $\binom{[n]}{\gamma}$ the collection of all subsets of $[n]$ of cardinality $\gamma$. Now we are ready to state the combinatorial lemma.


Lemma 3.2. Let $A_n$ be an $n \times n$ matrix with zeros on the diagonal and off-diagonal entries $a_{i,j} = \delta_{i,j}\xi_{i,j}$, where $\delta_{i,j}$ are i.i.d. Bernoulli random variables with $\mathbb{P}(\delta_{i,j} = 1) = p$, where $p$ satisfies (1.5), and $\xi_{i,j}$ are centered i.i.d. random variables with $\max\{\mathbb{P}(\xi_{i,j} \ge 1), \mathbb{P}(\xi_{i,j} \le -1)\} \ge c_0$ for some positive constant $c_0$. For $\kappa \in \mathbb{N}$ and for $J, J' \subset [n]$, let $\mathcal{A}_\kappa^{J,J'}$ denote the event that there are at least $c_{3.2}\kappa pn$ rows of the matrix $A_n$ containing exactly one non-zero entry $a_{i,j}$ in the columns corresponding to $J$, for which $|a_{i,j}| \ge 1$, and all zero entries in the columns corresponding to $J'$. Denote

\[ m = m(\kappa) := \kappa\sqrt{pn} \wedge \frac{1}{8p}. \]

Then, there exist constants $0 < c_{3.2}, \tilde c_{3.2} < \infty$, depending only on $c_0$, such that

\[ \mathbb{P}\Big(\bigcap_{\kappa \le (8p\sqrt{pn})^{-1} \vee 1}\ \bigcap_{J \in \binom{[n]}{\kappa}}\ \bigcap_{J' \in \binom{[n]}{m},\ J\cap J' = \emptyset} \mathcal{A}_\kappa^{J,J'}\Big) \ge 1 - \exp(-\tilde c_{3.2}pn). \]
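
The event $\mathcal{A}_\kappa^{J,J'}$ counts rows with a specific sparsity pattern. As a minimal sketch of that pattern (not part of the proof), the hypothetical helper `good_rows` below lists the rows of a matrix that contain exactly one non-zero entry in the columns indexed by `J`, with that entry of absolute value at least 1, and only zeros in the columns indexed by `J_prime`. The small matrix is made up for illustration.

```python
def good_rows(A, J, J_prime):
    """Indices i such that row i of A has exactly one non-zero entry among
    the columns in J, that entry has absolute value >= 1, and all entries
    in the columns in J_prime are zero."""
    rows = []
    for i, row in enumerate(A):
        nonzero_in_J = [j for j in J if row[j] != 0]
        if len(nonzero_in_J) != 1:
            continue  # zero or several non-zero entries in J
        if abs(row[nonzero_in_J[0]]) < 1:
            continue  # the single entry is not large
        if any(row[j] != 0 for j in J_prime):
            continue  # a non-zero entry in J'
        rows.append(i)
    return rows


# Illustrative 4 x 4 matrix with zero diagonal.
A = [
    [0.0, 2.0, 0.0, 0.0],  # one large entry in J, zeros in J'
    [1.0, 0.0, 0.0, 5.0],  # no entry in J
    [0.0, 1.0, 0.0, 0.0],  # one large entry in J, zeros in J'
    [0.0, 0.5, 0.0, 0.0],  # the entry in J is too small
]
selected = good_rows(A, J=[1, 2], J_prime=[3])
```

The lemma asserts that, with high probability, the number of such rows is at least $c_{3.2}\kappa pn$ simultaneously for all admissible pairs $(J, J')$.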


Before going to the proof let us mention that we will often write $\gamma$ instead of $\lfloor\gamma\rfloor$ (the floor of $\gamma$, i.e. the largest integer less than or equal to $\gamma$), even when $\gamma$ is not an integer. This will not make any changes in the proof. We adopt this approach to simplify the presentation.

Proof. Fix $\kappa \le (8p\sqrt{pn})^{-1} \vee 1$ and a set $J \in \binom{[n]}{\kappa}$. Let $I^1(J)$ be the set of all rows of $A_n$ containing exactly one large entry in the columns corresponding to $J$:

\[ I^1(J) := \big\{ i \in [n] : |a_{i,j_i}| \ge 1 \text{ for some } j_i \in J, \text{ and } a_{i,j} = 0 \text{ for all } j \in J\setminus\{j_i\} \big\}. \]

Similarly for a set $J' \in \binom{[n]}{m}$ we define

\[ I^0(J') := \big\{ i \in [n] : a_{i,j} = 0 \text{ for all } j \in J' \big\}. \]

To prove the desired result we first show that the cardinality of the subset $I^1(J)$ must be at least $c\kappa pn$ with large probability, for some positive constant $c$. Then using Chernoff's bound we argue that $|I^1(J) \cap I^0(J')|$ is also large with large probability. Finally taking union bounds over the set of choices of $J$, and over $\kappa$, we complete the proof.

To this end, we begin by obtaining a lower bound on $\mathbb{P}(i \in I^1(J))$ for every $i \in [n]$. Recall that the diagonal entries of $A_n$ are zero, and therefore we need to consider the two cases $i \in [n]\setminus J$ and $i \in J$ separately.

Now, by the independence of the random variables $\{a_{i,j}\}$, and the fact that $\max\{\mathbb{P}(\xi_{i,j} \ge 1), \mathbb{P}(\xi_{i,j} \le -1)\} \ge c_0$, it follows that, for every $i \in [n]\setminus J$,

\[ \mathbb{P}(i \in I^1(J)) \ge c_0|J|\cdot p(1-p)^{|J|-1} \ge c_0\kappa p(1-\kappa p) \ge \frac{c_0}{2}\kappa p. \tag{3.1} \]
Similarly for every $i \in J$, whenever $|J| = \kappa \ge 3$, we also have that

\[ \mathbb{P}(i \in I^1(J)) = \mathbb{P}(i \in I^1(J\setminus\{i\})) \ge c_0(|J|-1)\cdot p(1-p)^{|J|-2} \ge c_0(\kappa-1)p(1-\kappa p) \ge \frac{c_0}{2}\kappa p. \]

When $|J| = 2$, one can again show that $\mathbb{P}(i \in I^1(J)) \ge c_0 p = \frac{c_0}{2}\kappa p$, for any $i \in J$. Therefore, applying Chernoff's inequality, whenever $\kappa \ge 2$, we obtain

\[ \mathbb{P}\big(|I^1(J)| < \tfrac{c_0}{4}\kappa pn\big) \le \exp(-c_1\kappa pn), \tag{3.2} \]

for some positive finite constant $c_1$. For $|J| = \kappa = 1$, we note that $J \cap I^1(J) = \emptyset$. Therefore shrinking $c_1$ if necessary, and applying Chernoff's inequality again, we also obtain that

\[ \mathbb{P}\big(|I^1(J)| < \tfrac{c_0}{4}p(n-1)\big) \le \mathbb{P}\big(|I^1(J)| < \tfrac{c_0}{4}pn\big) \le \exp(-c_1 pn). \]

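This establishes (3.2) for all values of $\kappa$. Next for a fixed set $J' \in \binom{[n]}{m}$, and for any $i \in [n]\setminus J'$, we have that

\[ \mathbb{P}(i \in I^0(J')) = (1-p)^{|J'|} \ge 1 - p\cdot|J'| = 1 - p\cdot m. \tag{3.3} \]

Similarly, for $i \in J'$,

\[ \mathbb{P}(i \in I^0(J')) = (1-p)^{|J'|-1} \ge 1 - p\cdot(m-1). \]
Thus, for a given $I \subset [n]$, the random variable $|I\setminus I^0(J')|$ can be represented as the sum of independent Bernoulli variables taking value 1 with probability either $q_1 := 1-(1-p)^{m}$ or $q_2 := 1-(1-p)^{m-1}$, where $\max\{q_1, q_2\} \le pm$. Note that $\mathbb{E}|I\setminus I^0(J')| \le pm\cdot|I| \le |I|/4$ by the assumption on $\kappa$ and $m$. Hence, by Chernoff's inequality

\[ \mathbb{P}\Big(|I\setminus I^0(J')| \ge \frac12|I|\Big) \le \exp\Big(-\frac{|I|}{16}\log\Big(\frac{1}{4pm}\Big)\Big). \]

Therefore, for any $I \subset [n]$ such that $|I| \ge \frac{c_0}{4}\kappa pn$, we deduce that

\begin{align*}
\mathbb{P}\Big(\exists J' \in \binom{[n]}{m} \text{ such that } |I^0(J') \cap I| < \frac{c_0}{8}\kappa pn\Big)
&\le \sum_{J' \in \binom{[n]}{m}} \mathbb{P}\Big(|I\setminus I^0(J')| \ge \frac12|I|\Big) \\
&\le \binom{n}{m}\exp\Big(-\frac{|I|}{16}\log\Big(\frac{1}{4pm}\Big)\Big) \\
&\le \exp\Big(m\log\Big(\frac{en}{m}\Big) - \frac{c_0}{64}\kappa pn\cdot\log\Big(\frac{1}{4pm}\Big)\Big) = \exp(-\kappa pn\cdot U),
\end{align*}

where

\[ U := \frac{c_0}{64}\log\Big(\frac{1}{4pm}\Big) - \frac{m}{\kappa pn}\log\Big(\frac{en}{m}\Big). \]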
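
The quantity $U$ can be evaluated directly, which may help when following the case analysis below. The sketch is purely numerical and uses assumed sample values; the function name `U_value`, the choice $c_0 = 1$ and the values of $n$ and $p$ are illustrative and not taken from the proof. The lower bound on $U$ used in the proof is asymptotic, so it is only expected to hold for large $n$.

```python
import math


def U_value(n, p, kappa, c0):
    """U = (c0/64) log(1/(4pm)) - (m/(kappa p n)) log(e n / m),
    with m = min(kappa * sqrt(pn), 1/(8p))."""
    m = min(kappa * math.sqrt(p * n), 1.0 / (8.0 * p))
    first = (c0 / 64.0) * math.log(1.0 / (4.0 * p * m))
    second = (m / (kappa * p * n)) * math.log(math.e * n / m)
    return first - second


# Dense regime p = n^(-1/3)/4 with kappa = 1 (sample values).
u_dense = U_value(n=1e18, p=2.5e-7, kappa=1, c0=1.0)

# Sparse regime p < n^(-1/3)/4 with kappa = 1: compare with the closed form
# (c0/64) log(alpha) - log(4 e p n alpha)/sqrt(pn), alpha = 1/(4 kappa p sqrt(pn)).
n, p, kappa, c0 = 1e18, 1e-12, 1, 1.0
alpha = 1.0 / (4.0 * kappa * p * math.sqrt(p * n))
u_sparse = U_value(n, p, kappa, c0)
u_closed = (c0 / 64.0) * math.log(alpha) - math.log(4.0 * math.e * p * n * alpha) / math.sqrt(p * n)
```

The agreement of `u_sparse` and `u_closed` reflects the identity $en/(\kappa\sqrt{pn}) = 4epn\alpha$ with $\alpha = 1/(4\kappa p\sqrt{pn})$.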


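We claim that $U \ge c_0/100$. To prove this, consider two cases. First, assume that $p \ge \frac14 n^{-1/3}$. In this case, $\kappa = 1$ and $m = \frac{1}{8p}$. Therefore, for all large $n$,

\[ U = \frac{c_0}{64}\log 2 - \frac{1}{8p^2 n}\log(en\cdot 8p) \ge \frac{c_0}{64}\log 2 - 2n^{-1/3}\cdot\log(8en) \ge \frac{c_0}{100}, \]

where the first inequality holds by the assumption on $p$.

Now, assume that $p$ satisfies (1.5) and $p < \frac14 n^{-1/3}$. Then $1 \le \kappa \le \frac{1}{8p\sqrt{pn}}$, and $m = \kappa\sqrt{pn}$. Denote

\[ \alpha := \frac{1}{4\kappa p\sqrt{pn}}. \]

The assumption on $\kappa$ implies that $\alpha \ge 2$. Hence,

\begin{align*}
U &= \frac{c_0}{64}\log\Big(\frac{1}{4\kappa p\sqrt{pn}}\Big) - \frac{1}{\sqrt{pn}}\log\Big(\frac{en}{\kappa\sqrt{pn}}\Big) \\
&= \frac{c_0}{64}\log\alpha - \frac{1}{\sqrt{pn}}\log(4epn\alpha) \\
&= \frac{c_0}{64}\log\alpha - \frac{1}{\sqrt{pn}}\log\alpha - \frac{1}{\sqrt{pn}}\big(\log(4e) + \log(pn)\big).
\end{align*}

Now noting that by the assumption on $p$ we have $pn \to \infty$ as $n \to \infty$, and using the fact that $x^{-1/2}\log x \to 0$ as $x \to \infty$, we conclude that $U \ge \frac{c_0}{100}$ for all large $n$. This proves that, for any $I \subset [n]$ with $|I| \ge \frac{c_0}{4}\kappa pn$, we have

\[ \mathbb{P}\Big(\exists J' \in \binom{[n]}{m} \text{ such that } |I^0(J') \cap I| < \frac{c_0}{8}\kappa pn\Big) \le \exp(-c_2\kappa pn), \]
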
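for some positive finite constant $c_2$. Now for a set $J \in \binom{[n]}{\kappa}$ define

\[ p_J := \mathbb{P}\Big(\exists J' \in \binom{[n]}{m},\ J \cap J' = \emptyset, \text{ such that } |I^1(J) \cap I^0(J')| < \frac{c_0}{8}\kappa pn\Big). \]

Since $J$ and $J'$ are disjoint, it is easy to note that the random subsets $I^1(J)$ and $I^0(J')$ are independent. Using (3.2) this now implies that

\begin{align*}
p_J \le{}& \sum_{I \subset [n],\ |I| < \frac{c_0}{4}\kappa pn} \mathbb{P}(I^1(J) = I) \\
&+ \sum_{I \subset [n],\ |I| \ge \frac{c_0}{4}\kappa pn} \mathbb{P}\Big(I^1(J) = I,\ \exists J' \in \binom{[n]}{m} \text{ such that } |I^0(J') \cap I| < \frac{c_0}{8}\kappa pn\Big) \\
\le{}& \mathbb{P}\big(|I^1(J)| < \tfrac{c_0}{4}\kappa pn\big) + \exp(-c_2\kappa pn)\sum_{I \subset [n],\ |I| \ge \frac{c_0}{4}\kappa pn} \mathbb{P}(I^1(J) = I) \le \exp(-c_3\kappa pn), \tag{3.4}
\end{align*}

for all large $n$, where $c_3$ is another positive constant.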

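The rest of the proof consists of taking the union bounds. First, using the union bound over $J \in \binom{[n]}{\kappa}$, setting $c_{3.2} = c_0/8$, and enlarging $C_1$ if needed, we get that

\[ \mathbb{P}\Big(\bigcup_{J \in \binom{[n]}{\kappa}}\ \bigcup_{J' \in \binom{[n]}{m},\ J \cap J' = \emptyset} \big(\mathcal{A}_\kappa^{J,J'}\big)^c\Big) \le \binom{n}{\kappa}\exp(-c_3\kappa pn) \le \exp(\kappa\log n - c_3\kappa pn) \le \exp(-c_3'\kappa pn), \]

for some positive finite constant $c_3'$. The last inequality here follows from assumption (1.5). Finally taking another union bound over $\kappa$ we obtain the desired result. □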


We use Lemma 3.2 to establish a uniform small ball probability bound for the set of dominated vectors. Before formulating the result, note that the condition $p \le c(K+R)^{-2}$ introduced at the beginning of Section 2 ensures that $1/(8p) > 1$.


Lemma 3.3. Let $A_n$ be the matrix defined in Proposition 3.1, and let $p$ satisfy (1.5). Denote

\[ \ell_0 := \Big\lceil \frac{\log(1/(8p))}{\log\sqrt{pn}} \Big\rceil. \tag{3.5} \]

Fix $K, R \ge 1$, and let $D_n$ be a real diagonal matrix with $\|D_n\| \le R\sqrt{np}$. Then there exist constants $0 < c_{3.3}, C_{3.3}, \tilde C_{3.3} < \infty$, depending only on the fourth moment of $\{\xi_{i,j}\}$, such that

\[ \mathbb{P}\Big(\exists x \in \mathrm{Dom}((8p)^{-1}, (C_{3.3}(K+R))^{-1}) \text{ such that } \|(A_n+D_n)x\|_2 \le (\tilde C_{3.3}(K+R))^{-\ell_0}\sqrt{np} \text{ and } \|A_n\| \le K\sqrt{pn}\Big) \le \exp(-c_{3.3}pn). \]


Proof. We first prove the result for $\mathrm{Sparse}((8p)^{-1})$ vectors of unit norm, and then we show that the estimates are automatically extended to $\mathrm{Dom}((8p)^{-1}, (C(K+R))^{-1})$ vectors, for some large constant $C$. Our proof strategy for sparse vectors depends on $p$. If $p \ge (1/4)n^{-1/3}$, we apply Lemma 3.2 with $\kappa = 1$ and $m = (8p)^{-1}$. The range $p < (1/4)n^{-1/3}$ requires a different approach since we cannot reach the level of sparsity $O(p^{-1})$ in one step. Instead we use Lemma 3.2 with different values of $\kappa$ depending on the distribution of coordinates of the vector. Assuming that the event described in this lemma occurs, we split the vector into blocks with disjoint support. One of these blocks has a large $\ell_2$ norm. By the assertion of Lemma 3.2, a large number of rows of the matrix $A_n$ have exactly one non-zero entry in columns corresponding to the support of this block. This will be enough to conclude that $\|(A_n+D_n)x\|_2$ is bounded below, for $x \in \mathrm{Sparse}((8p)^{-1})$. Note that while applying Lemma 3.2 we need $\max\{\mathbb{P}(\xi_{i,j} \ge 1), \mathbb{P}(\xi_{i,j} \le -1)\} \ge c_0$. Since the $\xi_{i,j}$ are centered and have unit variance it is easy to check that $\max\{\mathbb{P}(\xi_{i,j} \ge 1/2), \mathbb{P}(\xi_{i,j} \le -1/2)\} > 0$, and therefore without loss of generality we can work with a scaled version of $\xi_{i,j}$. Since the fourth moments of the $\xi_{i,j}$ are bounded, upon an application of the Paley-Zygmund inequality (see [16, Lemma 3.5]), we further obtain a uniform lower bound on the value of $c_0$.

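We now begin with large values of $p$, that is, $p \ge (1/4)n^{-1/3}$. In this case, $\ell_0 = 1$, and we prove that there exist constants $\tilde c_0$ and $c_0'$ such that

\[ \mathbb{P}\Big(\exists x \in \mathrm{Sparse}(1/(8p)) \cap S^{n-1} \text{ such that } \|(A_n+D_n)x\|_2 \le \sqrt{\tilde c_0 np} \text{ and } \|A_n\| \le K\sqrt{pn}\Big) \le \exp(-c_0' pn). \tag{3.6} \]

For $k \in [n]$, set $J_k = \{k\}$ and $J_k' = \mathrm{supp}(x)\setminus J_k$. Let $\mathcal{A}$ be the event that for each $k \in [n]$ there exists a set $I_k \subset [n]$ of rows such that $|I_k| = c_{3.2}pn$, and for any $i \in I_k$, $|a_{i,k}| \ge 1$ and $a_{i,j} = 0$ for $j \in \mathrm{supp}(x)\setminus\{k\}$. The definition of the sets $I_k$ immediately implies that $I_k \cap I_{k'} = \emptyset$ for $k \ne k' \in \mathrm{supp}(x)$. By Lemma 3.2, $\mathbb{P}(\mathcal{A}) \ge 1 - \exp(-\tilde c_{3.2}pn)$. This shows that on this large set $\mathcal{A}$, we have that

\[ \|(A_n+D_n)x\|_2^2 \ge \sum_{k \in \mathrm{supp}(x)}\ \sum_{i \in I_k} \big((A_n+D_n)x\big)_i^2. \tag{3.7} \]

To get rid of the diagonal matrix $D_n$, let us consider only the coordinates $i \in I_k\setminus\mathrm{supp}(x)$. For these coordinates, $((A_n+D_n)x)_i = (A_n x)_i$. The assumption on $p$ implies $|I_k| \gg |\mathrm{supp}(x)| = O(p^{-1})$, and so $|I_k\setminus\mathrm{supp}(x)| \ge \frac{c_{3.2}}{2}pn$. Hence,

\[ \|(A_n+D_n)x\|_2^2 \ge \sum_{k \in \mathrm{supp}(x)}\ \sum_{i \in I_k\setminus\mathrm{supp}(x)} (a_{i,k}x_k)^2 \ge \sum_{k \in \mathrm{supp}(x)} \frac{c_{3.2}}{2}pn\cdot x_k^2 = \frac{c_{3.2}}{2}pn. \tag{3.8} \]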

Thus, setting $c_0' = \tilde c_{3.2}$, and $\tilde c_0 = \frac{c_{3.2}}{2}$ we have (3.6). This estimate can be automatically extended to the set $\mathrm{Dom}((8p)^{-1}, (C(K+R))^{-1})$ provided that the constant $C$ is large enough. Indeed, assume that

\[ \|(A_n+D_n)x\|_2 \le \frac12\sqrt{\tilde c_0 pn} \tag{3.9} \]

for some $x \in \mathrm{Dom}((8p)^{-1}, (C(K+R))^{-1})$. Set $m = (8p)^{-1}$. Since $x \in S^{n-1}$, it is easy to note that $\|x_{[m+1:n]}\|_\infty \le m^{-1/2}$. Hence,

\[ \|x_{[m+1:n]}\|_2 \le (C(K+R))^{-1}\sqrt{m}\,\|x_{[m+1:n]}\|_\infty \le (C(K+R))^{-1}, \]

and therefore

\[ \|(A_n+D_n)x_{[1:m]}\|_2 \le \|(A_n+D_n)x\|_2 + (\|A_n\| + R\sqrt{np})\|x_{[m+1:n]}\|_2 \le \frac12\sqrt{\tilde c_0 pn} + (K+R)\sqrt{pn}\cdot(C(K+R))^{-1} \le \frac34\sqrt{\tilde c_0 pn}, \]

when $C \ge \frac{4}{\sqrt{\tilde c_0}}$. Furthermore

\[ \Big|\,\big\|(A_n+D_n)\big(x_{[1:m]}/\|x_{[1:m]}\|_2\big)\big\|_2 - \|(A_n+D_n)x_{[1:m]}\|_2\,\Big| \le (K+R)\sqrt{pn}\,\big|1 - \|x_{[1:m]}\|_2\big| \le \frac{\sqrt{\tilde c_0 pn}}{4}. \]

Since $x_{[1:m]}/\|x_{[1:m]}\|_2 \in \mathrm{Sparse}((8p)^{-1}) \cap S^{n-1}$, combining the above steps we note that the inequality in (3.9) holds only in $\mathcal{A}^c$. Therefore, setting $C_{3.3} = \tilde C_{3.3} = \frac{4}{\sqrt{\tilde c_0}}$, and $c_{3.3} = \tilde c_{3.2}$, we prove the lemma for $p \ge (1/4)n^{-1/3}$.
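
The sets $\mathrm{Dom}(m, c)$ and $\mathrm{Comp}(m, \rho)$ are defined earlier in the paper and not restated here. The sketch below encodes the membership conditions as they are used in the argument above, namely $\|x_{[m+1:n]}\|_2 \le c\sqrt{m}\,\|x_{[m+1:n]}\|_\infty$ for dominated vectors and $\|x_{[m+1:n]}\|_2 \le \rho$ for compressible ones. These conditions are inferred from the way the sets are used in this section and should be read as assumptions; the helper names are hypothetical.

```python
import math


def tail(x, m):
    """Magnitudes of x sorted in non-increasing order, with the m largest removed."""
    mags = sorted((abs(t) for t in x), reverse=True)
    return mags[m:]


def l2(v):
    return math.sqrt(sum(t * t for t in v))


def is_dominated(x, m, c):
    """Assumed condition: ||x_[m+1:n]||_2 <= c * sqrt(m) * ||x_[m+1:n]||_inf."""
    t = tail(x, m)
    if not t:
        return True
    return l2(t) <= c * math.sqrt(m) * max(t)


def is_compressible(x, m, rho):
    """Assumed condition: ||x_[m+1:n]||_2 <= rho (x is within rho of an m-sparse vector)."""
    return l2(tail(x, m)) <= rho


sparse_vec = [1.0, 0.0, 0.0, 0.0]   # 1-sparse unit vector
flat_vec = [0.5, 0.5, 0.5, 0.5]     # fully spread unit vector
```

Under these conditions every $m$-sparse unit vector is dominated for any $c > 0$, in line with the inclusion of sparse vectors in the dominated ones used in this section, while a flat vector is dominated only for large $c$.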


We now consider the more difficult case, in which $p$ satisfies (1.5) and $p < (1/4)n^{-1/3}$. Note that for such values of $p$,

\[ \frac{1}{8p\sqrt{pn}} > 1. \]

To simplify the notation in the proof below, assume in addition that $(pn)^{\ell_0/2} = (8p)^{-1}$, i.e. the integer part in the definition of $\ell_0$ is redundant.

Consider $x \in \mathrm{Dom}((8p)^{-1}, (C_{3.3}(K+R))^{-1})$. Let us rearrange the magnitudes of the coordinates of $x$ and group them in blocks of lengths $(pn)^{\ell/2}$, where $\ell = 1,\ldots,\ell_0$. More precisely, set

\[ z_\ell = x_{[(pn)^{(\ell-1)/2}+1\,:\,(pn)^{\ell/2}]},^{3} \]

and

\[ z_{\ell_0+1} = x_{[(pn)^{\ell_0/2}+1\,:\,n]}. \]

For simplicity of notation denote $m = (8p)^{-1} = (pn)^{\ell_0/2}$. Let us show that one of the blocks $z_1,\ldots,z_{\ell_0}$ has a substantial $\ell_2$ norm. Note that

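\[ \|z_{\ell_0+1}\|_2 \le (C_{3.3}(K+R))^{-1}\sqrt{m}\,\|z_{\ell_0+1}\|_\infty \le \sqrt2\,(C_{3.3}(K+R))^{-1}\|x_{[m/2:m]}\|_2 \le \sqrt2\,(C_{3.3}(K+R))^{-1}\|z_{\ell_0}\|_2, \tag{3.10} \]

where in the last step we use the fact that $np \to \infty$ as $n \to \infty$, and so the support of $z_{\ell_0}$ contains that of $x_{[m/2:m]}$. As $x \in S^{n-1}$ implies $\sum_{\ell=1}^{\ell_0+1}\|z_\ell\|_2^2 = 1$, we have

\[ \sum_{\ell=1}^{\ell_0}\|z_\ell\|_2^2 \ge 1 - 2(C_{3.3}(K+R))^{-2}. \]

On the other hand, for any $K \ge 1$ and $R > 0$, if $C_{3.3} \ge 2$ then $\sum_{\ell=1}^{\ell_0}(C_{3.3}(K+R))^{-2\ell} < 1 - 2(C_{3.3}(K+R))^{-2}$. Thus

\[ \sum_{\ell=1}^{\ell_0}(C_{3.3}(K+R))^{-2\ell} < \sum_{\ell=1}^{\ell_0}\|z_\ell\|_2^2, \]

which implies that there exists $\ell \le \ell_0$ such that $\|z_\ell\|_2 \ge (C_{3.3}(K+R))^{-\ell}$. Let $\ell_*$ be the largest index having this property, and set $u = \sum_{\ell=1}^{\ell_*} z_\ell$, $v = \sum_{\ell=\ell_*+1}^{\ell_0+1} z_\ell$. First consider the case when $\ell_* < \ell_0$. Then by the triangle inequality and (3.10), we have that

\[ \|v\|_2 \le \sum_{m=\ell_*+1}^{\ell_0+1}\|z_m\|_2 \le 2\sqrt2\,(C_{3.3}(K+R))^{-(\ell_*+1)}. \]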
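
The block decomposition and the choice of $\ell_*$ described above can be sketched as follows. The helper names, the rounding of block boundaries to integers, and the toy vector are assumptions made for illustration; `base` plays the role of $\sqrt{pn}$ and `c` the role of $(C_{3.3}(K+R))^{-1}$.

```python
import math


def split_into_blocks(x, base, num_blocks):
    """Sort |x_i| in non-increasing order and cut at positions base^1, base^2, ...,
    base^num_blocks. Returns z_1, ..., z_{num_blocks}, followed by the remainder."""
    mags = sorted((abs(t) for t in x), reverse=True)
    blocks, start = [], 0
    for ell in range(1, num_blocks + 1):
        end = min(len(mags), int(round(base ** ell)))
        blocks.append(mags[start:end])
        start = end
    blocks.append(mags[start:])  # z_{ell_0 + 1}
    return blocks


def largest_heavy_block(blocks, c):
    """Largest ell <= ell_0 with ||z_ell||_2 >= c^ell, or None if there is none."""
    best = None
    for ell, z in enumerate(blocks[:-1], start=1):
        if math.sqrt(sum(t * t for t in z)) >= c ** ell:
            best = ell
    return best


# Toy unit vector whose mass sits on its two largest coordinates.
x = [0.6, 0.8] + [0.0] * 18
blocks = split_into_blocks(x, base=2, num_blocks=2)
ell_star = largest_heavy_block(blocks, c=0.5)
```

In the proof, the blocks up to index $\ell_*$ form the vector $u$, which carries the lower bound, and the remaining blocks form $v$, whose norm is controlled by the maximality of $\ell_*$.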

Let $\kappa = (pn)^{(\ell_*-1)/2}$. Note that

\[ \kappa \le (pn)^{(\ell_0-1)/2} \le \frac{1}{8p\sqrt{pn}}. \]

$^3$ When $\ell = 1$, by a slight abuse of notation we take $z_1 = x_{[1:\sqrt{np}]}$.
We will apply Lemma 3.2 with this choice of $\kappa$. Split the support of $u$ into $(pn)^{1/2}$ blocks of equal size $\kappa$. To this end, define $L_{\ell_*} := \pi_x^{-1}([1, (np)^{\ell_*/2}])$, where $\pi_x$ is the permutation arranging the absolute values of the coordinates of $x$ in a non-increasing order. For $s \in [(pn)^{1/2}]$, define $J_s := \pi_x^{-1}([(s-1)\kappa+1, s\kappa])$ and set $J_s' = L_{\ell_*}\setminus J_s$. Since $|J_s'| \le |L_{\ell_*}| = \kappa\sqrt{pn}$, we apply Lemma 3.2 to get a set $\mathcal{A}$ with large probability, such that on $\mathcal{A}$, there exists a subset of rows $I_s$ with $|I_s| \ge c_{3.2}\kappa pn$ for all $s \in [\sqrt{pn}]$, such that for every $i \in I_s$, we have $|a_{i,j_0}| \ge 1$ for only one index $j_0 \in J_s$ and $a_{i,j} = 0$ for all $j \in J_s \cup J_s'\setminus\{j_0\}$. It can further be checked that $I_1, I_2, \ldots, I_{\sqrt{pn}}$ are disjoint subsets. Therefore, on the set $\mathcal{A}$, for any $i \in I_s$,

\[ |(A_n u)_i| = |a_{i,j_0}u(j_0)| = |a_{i,j_0}|\cdot|u(j_0)| \ge |x(\pi_x^{-1}(s\kappa))|. \]

Here we used that $\pi_x$ is a non-increasing rearrangement. Now note that for $i \notin \mathrm{supp}(u)$,

\[ ((A_n+D_n)u)_i = (A_n u)_i, \quad \text{and} \quad |\mathrm{supp}(u)| = \kappa\sqrt{np} \le \frac{c_{3.2}}{2}\kappa np, \]
as long as $np \to \infty$. Therefore,


\begin{align*}
\|(A_n+D_n)u\|_2^2 &\ge \sum_{s=1}^{(pn)^{1/2}}\ \sum_{i \in I_s\setminus\mathrm{supp}(u)} \big((A_n u)_i\big)^2 \ge \frac{c_{3.2}\kappa pn}{2}\sum_{s=1}^{(pn)^{1/2}} \big(x(\pi_x^{-1}(s\kappa))\big)^2 \\
&\ge \frac{c_{3.2}pn}{2}\sum_{k=(pn)^{(\ell_*-1)/2}}^{(pn)^{\ell_*/2}} \big(x(\pi_x^{-1}(k))\big)^2 \\
&\ge \frac{c_{3.2}pn}{2}\|z_{\ell_*}\|_2^2 \ge \frac{c_{3.2}pn}{2}(C_{3.3}(K+R))^{-2\ell_*}, \tag{3.11}
\end{align*}

where the third inequality uses monotonicity of the sequence $\{|x(\pi_x^{-1}(k))|\}_{k=1}^n$. Combining this with the bound on $\|v\|_2$, on the set $\mathcal{A}$, we get that

\begin{align*}
\|(A_n+D_n)x\|_2 &\ge \|(A_n+D_n)u\|_2 - \|A_n+D_n\|\cdot\|v\|_2 \\
&\ge \sqrt{\frac{c_{3.2}pn}{2}}\,(C_{3.3}(K+R))^{-\ell_*} - (K+R)\sqrt{pn}\cdot 2\sqrt2\,(C_{3.3}(K+R))^{-(\ell_*+1)} \\
&\ge (\tilde C_{3.3}(K+R))^{-\ell_*}\sqrt{pn},
\end{align*}

where the last inequality follows if the constants $C_{3.3}$, $\tilde C_{3.3}$ are chosen large enough independently of $\ell_*$.

Now it remains to consider the case when $\ell_* = \ell_0$. Note that in this case, using (3.11), we have that

\[ \|(A_n+D_n)u\|_2 \ge \sqrt{\frac{c_{3.2}pn}{2}}\,\|z_{\ell_0}\|_2, \]

and from (3.10), we have $\|v\|_2 = \|z_{\ell_0+1}\|_2 \le \sqrt2\,(C_{3.3}(K+R))^{-1}\|z_{\ell_0}\|_2$. Now proceeding similarly as before, on $\mathcal{A}$, we obtain that

\[ \|(A_n+D_n)x\|_2 \ge (\tilde C_{3.3}(K+R))^{-\ell_0}\sqrt{pn}. \]

Since by Lemma 3.2, $\mathbb{P}(\mathcal{A}) \ge 1 - \exp(-\tilde c_{3.2}pn)$, the proof is completed. □


We now extend the result of Lemma 3.3 to compressible vectors. This step requires only a simple approximation. Recall that $\mathrm{Sparse}((8p)^{-1}) \cap S^{n-1} \subset \mathrm{Dom}((8p)^{-1}, (C_{3.3}(K+R))^{-1})$.

Lemma 3.4. Let $A_n$ be the matrix defined in Proposition 3.1, and let $p$ satisfy (1.5). Fix $K, R \ge 1$, and let $D_n$ be a non-random diagonal matrix with real entries such that $\|D_n\| \le R\sqrt{np}$. Set

\[ \rho := (\tilde C_{3.3}(K+R))^{-(\ell_0+1)}, \]

where $\ell_0$ is defined in (3.5). Then

P^3x € Comp(( 8 p) -1 , p) such that || (A n + D n )x || 2 < —— y/np 

and 11 A n 11 < Ky/fmj 

< exp(-c 3 _ 3 pn). 

Proof. Denote

\[ \Omega_\rho := \Big\{\forall x \in \mathrm{Sparse}(1/(8p)) \cap S^{n-1} : \|(A_n+D_n)x\|_2 \ge \rho\,\tilde C_{3.3}(K+R)\sqrt{pn} \text{ and } \|A_n\| \le K\sqrt{pn}\Big\}. \]

Then on the set $\Omega_\rho$, for any $x \in \mathrm{Comp}((8p)^{-1}, \rho)$, we can find $\bar x \in \mathrm{Sparse}(1/(8p))$ such that $\|(A_n+D_n)(\bar x/\|\bar x\|_2)\|_2 \ge \rho\,\tilde C_{3.3}(K+R)\sqrt{pn}$, and $\|x - \bar x\|_2 \le \rho$. This also implies $|1 - \|\bar x\|_2| \le \rho$. Therefore

||(^4n T -^n )-*'|| 2 11 (^4^ + D n )(x/ ||aj || 2 )|| 2 — H^n T Dn 

when $\tilde C_{3.3} \ge 4$. Since by Lemma 3.3, $\mathbb{P}(\Omega_\rho) \ge 1 - \exp(-c_{3.3}pn)$, the result follows. □

3.2. Vectors close to moderately sparse. Lemma 3.3 and Lemma 3.4 provide a uniform lower bound on $\|(A_n+D_n)x\|_2$ for vectors which are close to very sparse vectors. To prove Proposition 3.1, we need to extend these estimates to vectors which are less sparse. For such vectors, we employ a different strategy. These vectors are sufficiently spread. This allows us to obtain a small ball probability estimate which is strong enough to use the $\varepsilon$-net argument. To this end, the Lévy concentration function turns out to be useful. Recall that the Lévy concentration function is given by

\[ \mathcal{L}(Z, \varepsilon) := \sup_{u \in \mathbb{R}^n} \mathbb{P}(\|Z - u\|_2 \le \varepsilon). \]

Below we prove several results about the Lévy concentration function, which are subsequently used in the proof of Lemma 3.8, and eventually lead to the proof of Proposition 3.1.
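
For a scalar random variable with finitely many atoms, the Lévy concentration function can be computed exactly, since an optimal closed window of length $2\varepsilon$ can always be shifted so that its left endpoint is an atom. The sketch below does this for the product $\xi\delta$ of a Rademacher variable $\xi$ and an independent Ber($p$) variable $\delta$. The Rademacher choice, the value $p = 0.1$ and the helper name are illustrative assumptions.

```python
def levy_concentration(atoms, probs, eps):
    """sup_u P(|Z - u| <= eps) for a discrete scalar Z with the given atoms.
    Windows [a, a + 2*eps] anchored at each atom a are enough to attain the sup."""
    pairs = sorted(zip(atoms, probs))
    best = 0.0
    for a, _ in pairs:
        # small tolerance so that atoms exactly at the right endpoint are counted
        mass = sum(q for b, q in pairs if a <= b <= a + 2 * eps + 1e-12)
        best = max(best, mass)
    return best


p = 0.1
atoms = [-1.0, 0.0, 1.0]            # values of xi * delta
probs = [p / 2, 1 - p, p / 2]

small_ball = levy_concentration(atoms, probs, eps=0.4)   # window sees only the atom at 0
medium_ball = levy_concentration(atoms, probs, eps=0.5)  # window reaches one extra atom
large_ball = levy_concentration(atoms, probs, eps=1.0)   # window covers everything
```

For small $\varepsilon$ the concentration function equals $1 - p$ in this example, which is the kind of $1 - \delta_0 p$ behaviour that single sparse entries exhibit and that the tensorization arguments below have to improve upon.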

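Lemma 3.5. Assume that the matrix $A_n$ satisfies the conditions of Proposition 3.1. For any $x \in \mathbb{R}^n$, let us denote by $x^{(i)}$ the vector obtained from $x$ by setting its $i$-th coordinate to be zero. Then there exists a positive constant $c_{3.5}$, depending only on the fourth moment of $\{\xi_{i,j}\}$, such that for any $x \in \mathbb{R}^n$ and any $i \in [n]$,

\[ \mathcal{L}\Big((A_n x)_i,\ \frac14\sqrt{p}\,\|x^{(i)}\|_2\Big) \le 1 - \frac{c_{3.5}\,p}{\big(\|x^{(i)}\|_\infty/\|x^{(i)}\|_2\big)^2 + p}. \]

Proof. We begin with the standard symmetrization. Let $\delta_1',\ldots,\delta_n'$ and $\xi_1',\ldots,\xi_n'$ be independent copies of $\delta_1,\ldots,\delta_n$ and $\xi_1,\ldots,\xi_n$. Since the diagonal entries of $A_n$ are zero, for any $b \in \mathbb{R}$ and $t > 0$, we have that

\begin{align*}
\mathbb{P}^2\big(|(A_n x)_i - b| \le t\big) &= \mathbb{P}\Big(\Big|\sum_{j \in [n]\setminus\{i\}} \delta_j\xi_j x_j - b\Big| \le t,\ \Big|\sum_{j \in [n]\setminus\{i\}} \delta_j'\xi_j' x_j - b\Big| \le t\Big) \\
&\le \mathbb{P}\Big(\Big|\sum_{j \in [n]\setminus\{i\}} (\delta_j\xi_j - \delta_j'\xi_j')x_j\Big| \le 2t\Big). \tag{3.12}
\end{align*}

Denote $\theta_j := \delta_j\xi_j - \delta_j'\xi_j'$. Then $\mathbb{E}\theta_j = \mathbb{E}\theta_j^3 = 0$, $\mathbb{E}\theta_j^2 = 2p$, and $\mathbb{E}\theta_j^4 \le cp$, for some constant $c$, depending only on the fourth moment of $\{\xi_{i,j}\}$. Set $S := \sum_{j \in [n]\setminus\{i\}} \theta_j x_j$. Then $\mathbb{E}S^2 = 2p\|x^{(i)}\|_2^2$, and

\begin{align*}
\mathbb{E}S^4 &= \sum_{j \in [n]\setminus\{i\}} \mathbb{E}\theta_j^4\,x_j^4 + 3\sum_{j \ne k \in [n]\setminus\{i\}} \mathbb{E}\theta_j^2 x_j^2\cdot\mathbb{E}\theta_k^2 x_k^2 \\
&\le cp\,\|x^{(i)}\|_\infty^2\cdot\|x^{(i)}\|_2^2 + 12p^2\|x^{(i)}\|_2^4.
\end{align*}

Then the Paley-Zygmund inequality (cf. [16, Lemma 3.5])

\[ \mathbb{P}(|S| \le t) \le 1 - \frac{(\mathbb{E}S^2 - t^2)^2}{\mathbb{E}S^4} \]

yields

\[ \mathbb{P}\Big(|S| \le \frac12\sqrt{p}\,\|x^{(i)}\|_2\Big) \le 1 - \frac{c'p}{\big(\|x^{(i)}\|_\infty/\|x^{(i)}\|_2\big)^2 + p}, \]

for some constant $c' < 1$, depending only on $c$. Combining this with (3.12), and setting $c_{3.5} = c'/2$, we obtain

\[ \mathcal{L}\Big((A_n x)_i,\ \frac14\sqrt{p}\,\|x^{(i)}\|_2\Big) \le \sqrt{1 - \frac{c'p}{\big(\|x^{(i)}\|_\infty/\|x^{(i)}\|_2\big)^2 + p}} \le 1 - \frac{c_{3.5}\,p}{\big(\|x^{(i)}\|_\infty/\|x^{(i)}\|_2\big)^2 + p}. \]

□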
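
The moment identities for the symmetrized variable $\theta_j = \delta_j\xi_j - \delta_j'\xi_j'$ can be checked by exact enumeration in a simple special case. The sketch below assumes Rademacher $\xi$ (an illustrative choice, the lemma only requires unit variance and a finite fourth moment) and uses exact rational arithmetic; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def theta_moments(p):
    """Exact moments E[theta^k], k = 0..4, of theta = a - b, where a and b are
    independent copies of delta * xi with delta ~ Ber(p) and Rademacher xi."""
    law = {0: 1 - p, 1: p / 2, -1: p / 2}  # law of delta * xi
    moments = [Fraction(0)] * 5
    for (a, pa), (b, pb) in product(law.items(), repeat=2):
        for k in range(5):
            moments[k] += pa * pb * (a - b) ** k
    return moments


p = Fraction(1, 10)
m0, m1, m2, m3, m4 = theta_moments(p)
```

In this special case the fourth moment equals $2p + 6p^2$, so the bound $\mathbb{E}\theta_j^4 \le cp$ holds with $c = 8$ for every $p \le 1$.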

To pass from an estimate for one coordinate to an estimate for the norm, we need the following elementary lemma.

Lemma 3.6. Let $V_1,\ldots,V_n$ be non-negative independent random variables such that $\mathbb{P}(V_i \ge 1) \ge q$, for all $i \in [n]$, and for some $q \in (0,1/2)$. Then there exist constants $0 < c_{3.6}, \tilde c_{3.6} < \infty$, such that
Proof. For a positive constant $\beta$, denote $L(\beta) :=$ ■ Let $J := \{j \in [n] : V_j \ge 1\}$. If $\sum_{j} V_j \le$

$L(\beta)$, then $|J| \le L(\beta)$. Hence,

< 3I3 > p (t V* S W)) S (1 - s exp (Kffl log - = ■ log ■ 

Since LEli log ^ 0 uniformly in g € (0,1/2) as f3 —> 0, we can choose fi small enough such 

that the RHS of (3.13) can be made smaller than exp(— c'qn) for some positive constant c'. This 
completes the proof. □ 


Combining Lemma 3.5 and Lemma 3.6, we obtain the following corollary.

Corollary 3.7. Let $A_n$ be as in Proposition 3.1. For every $x \in \mathbb{R}^n$ and $i \in [n]$, define $x^{(i)}$ to be the vector obtained from $x$ by setting its $i$-th coordinate to be zero. Then for any $\alpha > 1$, there exist $\beta, \gamma > 0$, depending on $\alpha$ and the fourth moment of $\{\xi_{i,j}\}$, such that for $x \in \mathbb{R}^n$ satisfying $\sup_{i \in [n]}\big(\|x^{(i)}\|_\infty/\|x^{(i)}\|_2\big) \le \alpha\sqrt{p}$, we have

\[ \mathcal{L}\Big(A_n x,\ \beta\cdot\sqrt{pn}\,\inf_{i \in [n]}\|x^{(i)}\|_2\Big) \le \exp(-\gamma n). \]

Proof. Fix any $y \in \mathbb{R}^n$, and let $V_j := \frac{16\big((A_n x)_j - y_j\big)^2}{p\,\|x^{(j)}\|_2^2}$. Since by our assumption,

\[ \inf_{i \in [n]} \frac{c_{3.5}\,p}{\big(\|x^{(i)}\|_\infty/\|x^{(i)}\|_2\big)^2 + p} \ge \frac{c_{3.5}}{\alpha^2+1}, \]

the claim then follows from Lemma 3.5 and Lemma 3.6 applied with

\[ q = \frac{c_{3.5}}{\alpha^2+1}. \]

□


Equipped with these results on the Lévy concentration function we now prove a uniform lower bound on $\|(A_n+D_n)x\|_2$ for vectors in $\mathrm{Dom}(M, (C_{3.8}(K+R))^{-4})$.

Lemma 3.8. Let $A_n$ be the matrix defined in Proposition 3.1, let $p$ satisfy (1.5), and let $\ell_0$ be as in (3.5). Fix $K, R \ge 1$, and let $D_n$ be any non-random diagonal matrix with real entries such that $\|D_n\| \le R\sqrt{np}$. Further denote

\[ \rho := (\tilde C_{3.3}(K+R))^{-(\ell_0+1)}. \]

There exist positive constants $C_{3.8}, \tilde C_{3.8}, c_{3.8}, \tilde c_{3.8}$, depending on $\mathbb{E}[\xi_{i,j}^4]$, $K$, and $R$, such that for any $p^{-1} \le M \le c_{3.8}n$,

\[ \mathbb{P}\Big(\exists x \in \mathrm{Dom}(M, (C_{3.8}(K+R))^{-4}) \text{ such that } \|(A_n+D_n)x\|_2 \le (\tilde C_{3.8}(K+R))^{-4}\rho\sqrt{np} \text{ and } \|A_n\| \le K\sqrt{pn}\Big) \le \exp(-\tilde c_{3.8}pn). \]

Proof. Let $c < 1$. Denote for shortness $m = (8p)^{-1}$, so $m \le M/2$. By Lemma 3.3 and Lemma 3.4, it is enough to obtain a uniform lower bound for all vectors from the set

\[ W := \mathrm{Dom}(M, (C_{3.8}(K+R))^{-4}) \setminus \big(\mathrm{Comp}((8p)^{-1}, \rho) \cup \mathrm{Dom}((8p)^{-1}, (C_{3.3}(K+R))^{-1})\big). \]
We begin with a smaller set

\[ V := \mathrm{Sparse}(M) \cap S^{n-1} \setminus \big(\mathrm{Comp}((8p)^{-1}, \rho) \cup \mathrm{Dom}((8p)^{-1}, (C_{3.3}(K+R))^{-1})\big). \]

First let us consider the case $p \ge (1/4)n^{-1/3}$. In this case the proof is based on a straightforward $\varepsilon$-net argument. Note that in this regime of $p$, $\ell_0 = 1$, and so $\rho = (\tilde C_{3.3}(K+R))^{-2}$. Since for any $x \in V$, $x \notin \mathrm{Dom}((8p)^{-1}, (C_{3.3}(K+R))^{-1})$, we have that

\[ \frac{\|x_{[m+1:M]}\|_\infty}{\|x_{[m+1:M]}\|_2} \le C_{3.3}(K+R)\sqrt{8p}. \]

However, to apply Corollary 3.7 we need to bound $\sup_{i \in [n]}\big(\|x_{[m+1:M]\setminus\{i\}}\|_\infty/\|x_{[m+1:M]\setminus\{i\}}\|_2\big)$. This can be obtained easily. Note that $\|x_{[m+1:M]}\|_\infty \le 1/\sqrt{m}$, and we have $x \notin \mathrm{Comp}(m, \rho)$, which in turn implies that $\|x_{[m+1:M]}\|_2 \ge \rho$. Therefore,

\[ \|x_{[m+1:M]\setminus\{i\}}\|_2^2 \ge \|x_{[m+1:M]}\|_2^2 - \|x_{[m+1:M]}\|_\infty^2 \ge \frac12\|x_{[m+1:M]}\|_2^2. \tag{3.14} \]

Here the last inequality follows from the assumption $p \le c(K+R)^{-2}$, for sufficiently small $c$, which we made at the beginning of Section 2. Therefore we have

\[ \sup_{i \in [n]} \frac{\|x_{[m+1:M]\setminus\{i\}}\|_\infty}{\|x_{[m+1:M]\setminus\{i\}}\|_2} \le 4C_{3.3}(K+R)\sqrt{p}. \]

Now by Corollary 3.7, enlarging $C_{3.8}$ if needed, we deduce that

\begin{align*}
\mathcal{L}\Big((A_n+D_n)x,\ (C_{3.8}(K+R))^{-3}\sqrt{pn}\,\inf_{i \in [n]}\|x_{[m+1:M]\setminus\{i\}}\|_2\Big)
&\le \mathcal{L}\Big(A_n x,\ (C_{3.8}(K+R))^{-3}\sqrt{pn}\,\inf_{i \in [n]}\|x_{[m+1:M]\setminus\{i\}}\|_2\Big) \\
&\le \exp(-c'n),
\end{align*}

for some constant $c'$ depending on $K$ and $R$. Using (3.14) again, and enlarging $C_{3.8}$ again, we further deduce that

\[ \mathcal{L}\Big((A_n+D_n)x,\ (C_{3.8}(K+R))^{-3}\sqrt{pn}\,\|x_{[m+1:M]}\|_2\Big) \le \exp(-c'n). \tag{3.15} \]

Now we will use this estimate of the Lévy concentration function to show that the infimum over $V$ is well controlled. To this end, we will use an $\varepsilon$-net argument. Since $V \subset \mathrm{Sparse}(M)$, we begin by noting that the set $V$ is contained in $S^{n-1}$ intersected with the union of coordinate subspaces of dimension $M$. Hence, for $\varepsilon = (C_{3.8}(K+R))^{-4}\rho$, there exists an $\varepsilon$-net $\mathcal{N} \subset V$ of cardinality less than

\[ \binom{n}{M}\Big(\frac{3}{\varepsilon}\Big)^{M} \le \exp\Big(c_{3.8}\,n\log\Big(\frac{3e}{c_{3.8}\,\varepsilon}\Big)\Big). \tag{3.16} \]

Here we used the assumption $M \le c_{3.8}n$. We can choose the constant $c_{3.8}$ sufficiently small so that $|\mathcal{N}| \le \exp((c'/2)n)$. Therefore using the union bound, we show that

\[ \mathbb{P}\Big(\exists x \in \mathcal{N} : \|(A_n+D_n)x\|_2 \le (C_{3.8}(K+R))^{-3}\sqrt{pn}\,\|x_{[m+1:M]}\|_2\Big) \le \exp(-(c'/2)n). \]
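
The cardinality estimate (3.16) compares $\binom{n}{M}(3/\varepsilon)^M$ with $\exp(c\,n\log(3e/(c\,\varepsilon)))$ when $M \le cn$. The following sketch evaluates both sides on a logarithmic scale for sample values; the function names and the numbers $n = 1000$, $c = 0.1$, $\varepsilon = 0.01$ are illustrative assumptions only.

```python
import math


def log_binom(n, k):
    """log of the binomial coefficient C(n, k), via the log-gamma function."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def log_net_size(n, M, eps):
    """log of C(n, M) * (3/eps)^M: supports times a net in each M-dimensional ball."""
    return log_binom(n, M) + M * math.log(3.0 / eps)


def log_net_bound(n, c, eps):
    """log of the right-hand side exp(c n log(3e / (c eps)))."""
    return c * n * math.log(3.0 * math.e / (c * eps))


n, c, eps = 1000, 0.1, 0.01
lhs_at_cn = log_net_size(n, M=100, eps=eps)    # M = c n
lhs_below = log_net_size(n, M=50, eps=eps)     # M < c n
rhs = log_net_bound(n, c, eps)
```

Since the exponent on the right is linear in $n$ with a factor that tends to zero as $c \to 0$ for fixed $\varepsilon$, choosing $c_{3.8}$ small makes the net small enough for the union bound against $\exp(-c'n)$.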

The proof in this case is finished by approximating any point of $W$ by a point of $\mathcal{N}$. Indeed, assume that for any $x \in \mathcal{N}$,

\[ \|(A_n+D_n)x\|_2 > (C_{3.8}(K+R))^{-3}\sqrt{pn}\,\|x_{[m+1:M]}\|_2. \]
Let $x' \in W$, then we can find $x \in \mathcal{N}$ such that $\big\|(x'_{[1:M]}/\|x'_{[1:M]}\|_2) - x\big\|_2 \le \varepsilon$. Let us show that $x$ approximates $x'$. Using $m \le M/2$ and the fact that all coordinates of $x'_{[M+1:n]}$ have smaller absolute values than those of $x'_{[m+1:M]}$, we conclude that

\[ \sqrt{M}\,\|x'_{[M+1:n]}\|_\infty \le \sqrt2\,\|x'_{[m+1:M]}\|_2. \]

Now recalling that $x' \in \mathrm{Dom}(M, (C_{3.8}(K+R))^{-4})$, we have

\[ \|x'_{[M+1:n]}\|_2 \le (C_{3.8}(K+R))^{-4}\sqrt{M}\,\|x'_{[M+1:n]}\|_\infty \le \sqrt2\,(C_{3.8}(K+R))^{-4}\|x'_{[m+1:M]}\|_2. \]

Next using the fact that $\big\|(x'_{[1:M]}/\|x'_{[1:M]}\|_2) - x\big\|_2 \le \varepsilon$, and applying the triangle inequality, we also obtain

\[ \|x'_{[m+1:M]}\|_2 \le \|x'_{[1:M]}\|_2\big(\|x_{[m+1:M]}\|_2 + \varepsilon\big) \le \|x_{[m+1:M]}\|_2 + \varepsilon. \]

For any $x \in \mathcal{N}$, $x \notin \mathrm{Comp}(m, \rho)$, so we further have $\|x_{[m+1:M]}\|_2 \ge \rho = (C_{3.8}(K+R))^{4}\varepsilon$. Using the two previous inequalities we further deduce

\[ \|x'_{[M+1:n]}\|_2 \le \sqrt2\,(C_{3.8}(K+R))^{-4}\|x'_{[m+1:M]}\|_2 \le 2(C_{3.8}(K+R))^{-4}\big(\|x_{[m+1:M]}\|_2 + \varepsilon\big) \le 4(C_{3.8}(K+R))^{-4}\|x_{[m+1:M]}\|_2, \]

and

\begin{align*}
\|x - x'\|_2 &\le \big\|x - (x'_{[1:M]}/\|x'_{[1:M]}\|_2)\big\|_2 + \big|1 - \|x'_{[1:M]}\|_2\big| + \|x'_{[M+1:n]}\|_2 \\
&\le \varepsilon + 2\|x'_{[M+1:n]}\|_2 \le \varepsilon + 8(C_{3.8}(K+R))^{-4}\|x_{[m+1:M]}\|_2 \\
&\le 9(C_{3.8}(K+R))^{-4}\|x_{[m+1:M]}\|_2.
\end{align*}

Thus, choosing $C_{3.8}$ sufficiently large, by the triangle inequality,

\begin{align*}
\|(A_n+D_n)x'\|_2 &\ge \|(A_n+D_n)x\|_2 - (\|A_n\| + \|D_n\|)\cdot\|x - x'\|_2 \\
&\ge \frac12(C_{3.8}(K+R))^{-3}\sqrt{pn}\,\|x_{[m+1:M]}\|_2 \ge \frac12(C_{3.8}(K+R))^{-3}\sqrt{pn}\,\rho.
\end{align*}

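Assume now that $p$ satisfies (1.5) and $p < (1/4)n^{-1/3}$. In this case, the proof uses a more delicate $\varepsilon$-net argument. To this end we combine two nets: a coarser one for small coordinates, and a finer one for large ones.

Let $I, J \subset [n]$ be disjoint sets such that $|I| = m$, $|J| = M - m$, where $m$ and $M$ are the same as in the previous case. Let $\varepsilon, \tau > 0$ be numbers to be chosen later. The sets

\[ B_I := \{u \in B_2^n : \mathrm{supp}(u) \subset I\}, \]

and

\[ R_J := \{w \in S^{n-1} : \mathrm{supp}(w) \subset J \text{ and } \|w\|_\infty \le 4C_{3.3}(K+R)\sqrt{p}\}, \]
admit an $\varepsilon$-net $\mathcal{N}_I \subset B_I$ and a $\tau$-net $\mathcal{N}_J \subset R_J$ of cardinalities

\[ |\mathcal{N}_I| \le \Big(\frac{3}{\varepsilon}\Big)^{m} \quad \text{and} \quad |\mathcal{N}_J| \le \Big(\frac{3}{\tau}\Big)^{M-m}. \]
Let $\mathcal{N}_0$ be an $\varepsilon$-net in $[\rho/\sqrt2, 1] \subset \mathbb{R}$, and let

\[ \mathcal{M}_{I,J} := \{u + lw : u \in \mathcal{N}_I,\ w \in \mathcal{N}_J,\ l \in \mathcal{N}_0\}, \]

and

\[ \mathcal{M} := \bigcup_{I \subset [n],\ |I| = m}\ \ \bigcup_{J \subset [n],\ |J| = M-m,\ I \cap J = \emptyset} \mathcal{M}_{I,J}. \]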

We now show that $\mathcal{M}$ serves as an appropriate net for $W$. To this end, decompose $x \in W$ as $x = u_x + v_x + r_x$, where $u_x = x_{[1:m]}$ contains the $m$ coordinates of $x$ with largest absolute values, $v_x = x_{[m+1:M]}$ the intermediate ones, and $r_x = x_{[M+1:n]}$ the rest. The assumption $x \notin \mathrm{Comp}(m, \rho) \cup \mathrm{Dom}(m, (C_{3.3}(K+R))^{-1})$ implies that

\[ \|v_x\|_2^2 + \|r_x\|_2^2 \ge \rho^2, \quad \text{and} \quad \frac{\|v_x\|_\infty}{\|x_{[m+1:n]}\|_2} \le C_{3.3}(K+R)\,m^{-1/2} = C_{3.3}(K+R)\sqrt{8p}. \tag{3.17} \]

Since $x \in W$,

\[ \|r_x\|_2 \le (C_{3.8}(K+R))^{-4}\sqrt{M}\,\|r_x\|_\infty \le 2(C_{3.8}(K+R))^{-4}\|v_x\|_2, \]
where as in the previous case, the last inequality follows from the facts that the coordinates of $r_x$ have smaller magnitudes than the non-zero coordinates of $v_x$ and $m \le M/2$. For $C_{3.8} \ge 2$ this in particular implies that $\|v_x\|_2 \ge \|r_x\|_2$. Therefore from (3.17) we further deduce that

\[ \|v_x\|_2 \ge \rho/\sqrt2 \quad \text{and} \quad \frac{\|v_x\|_\infty}{\|v_x\|_2} \le 4C_{3.3}(K+R)\sqrt{p}. \tag{3.18} \]

Assume that $\mathrm{supp}(u_x) \subset I$, $\mathrm{supp}(v_x) \subset J$, for some $I, J \subset [n]$. Choose $u \in \mathcal{N}_I$, $w \in \mathcal{N}_J$, $l \in \mathcal{N}_0$ such that

\[ \|u_x - u\|_2 \le \varepsilon, \quad \Big\|\frac{v_x}{\|v_x\|_2} - w\Big\|_2 \le \tau, \quad \text{and} \quad \big|\,l - \|v_x\|_2\big| \le \varepsilon, \tag{3.19} \]

and consider $\bar x = u + lw \in \mathcal{M}$. For $\varepsilon \le \rho/\sqrt2$, we also have that

\[ \|r_x\|_2 \le 2(C_{3.8}(K+R))^{-4}\|v_x\|_2 \le 2(C_{3.8}(K+R))^{-4}(l + \varepsilon) \le 4(C_{3.8}(K+R))^{-4}\,l. \tag{3.20} \]

Thus we see from (3.19)-(3.20) that $\bar x$ approximates $x$. Now using Corollary 3.7, we would have liked to show that for any $\bar x \in \mathcal{M}$, an inequality similar to (3.15) holds. However, such an inequality is not possible for every $\bar x \in \mathcal{M}$, as the conditions required for Corollary 3.7 do not hold for all $\bar x \in \mathcal{M}$. We solve this issue by modifying the net $\mathcal{M}$. We construct the modification $\mathcal{M}' \subset W$ as follows: if for an $\bar x \in \mathcal{M}$, there exists an $x \in W$ such that (3.19) holds, then we keep that $x$ in $\mathcal{M}'$ (if there is more than one choice we choose any one of them arbitrarily). Note that this construction ensures that $|\mathcal{M}'| \le |\mathcal{M}|$, and moreover by the triangle inequality it follows that, for any $x \in W$, there exists $\bar x \in \mathcal{M}'$ such that

\[ \|u_x - u_{\bar x}\|_2 \le 2\varepsilon, \quad \Big\|\frac{v_x}{\|v_x\|_2} - \frac{v_{\bar x}}{\|v_{\bar x}\|_2}\Big\|_2 \le 2\tau, \quad \text{and} \quad \big|\,\|v_x\|_2 - \|v_{\bar x}\|_2\big| \le 2\varepsilon. \tag{3.21} \]

Proceeding analogously to (3.20), we also deduce that

\[ \|r_x\|_2 \le 2(C_{3.8}(K+R))^{-4}\|v_x\|_2 \le 2(C_{3.8}(K+R))^{-4}\big(\|v_{\bar x}\|_2 + 2\varepsilon\big) \le 6(C_{3.8}(K+R))^{-4}\|v_{\bar x}\|_2. \tag{3.22} \]

Now fix $\bar x \in \mathcal{M}'$. Then, using (3.18), and proceeding as in (3.14)-(3.15), from Corollary 3.7, we deduce that

\[ \mathcal{L}\Big((A_n+D_n)\bar x,\ (C_{3.8}(K+R))^{-3}\sqrt{pn}\,\|v_{\bar x}\|_2\Big) \le \mathcal{L}\Big(A_n v_{\bar x},\ (C_{3.8}(K+R))^{-3}\sqrt{pn}\,\|v_{\bar x}\|_2\Big) \le e^{-c'n}. \]
Assume that the parameters $\varepsilon, \tau > 0$ are chosen so that

\[ |\mathcal{M}'| \le |\mathcal{M}| \le \binom{n}{m}\binom{n-m}{M-m}\,\frac{1}{\varepsilon}\Big(\frac{3}{\varepsilon}\Big)^{m}\Big(\frac{3}{\tau}\Big)^{M-m} \le e^{c'n/2}. \tag{3.23} \]

Then by the union bound,

\[ \mathbb{P}\Big(\exists \bar x \in \mathcal{M}' \text{ such that } \|(A_n+D_n)\bar x\|_2 \le (C_{3.8}(K+R))^{-3}\sqrt{pn}\,\|v_{\bar x}\|_2\Big) \le \exp(-(c'/2)n). \tag{3.24} \]

We now extend the uniform lower bound in (3.24) to all $x \in W$. In the process of this extension, we select the parameters $\varepsilon$ and $\tau$. Finally, we will check that this selection satisfies (3.23).

Assume that the complement of the event appearing in the LHS of (3.24) occurs. Now we recall that for every $x \in W$, there exists an $\bar x \in \mathcal{M}'$ such that (3.21) holds. Therefore

\[ \|(A_n+D_n)x\|_2 \ge \|(A_n+D_n)\bar x\|_2 - (\|A_n\| + \|D_n\|)\big(\|u_x - u_{\bar x}\|_2 + \|v_x - v_{\bar x}\|_2 + \|r_x\|_2 + \|r_{\bar x}\|_2\big). \tag{3.25} \]
To obtain a lower bound on the RHS of (3.25), we use (3.21) to note that

\[ \|v_x - v_{\bar x}\|_2 \le \|v_{\bar x}\|_2\,\Big\|\frac{v_x}{\|v_x\|_2} - \frac{v_{\bar x}}{\|v_{\bar x}\|_2}\Big\|_2 + \|v_x\|_2\,\Big|1 - \frac{\|v_{\bar x}\|_2}{\|v_x\|_2}\Big| \le 2\big(\varepsilon + \tau\|v_{\bar x}\|_2\big). \]

Further using (3.20) and (3.22) we also obtain that

\[ \|r_x\|_2 + \|r_{\bar x}\|_2 \le 8(C_{3.8}(K+R))^{-4}\|v_{\bar x}\|_2. \]

Denoting $\rho' = (C_{3.8}(K+R))^{-3}$, applying the previous two estimates, and (3.21), from (3.25), we therefore deduce that

\[ \|(A_n+D_n)x\|_2 \ge \rho'\|v_{\bar x}\|_2\sqrt{pn} - 2(K+R)\sqrt{pn}\cdot\big(\varepsilon + \|v_{\bar x}\|_2\,\tau + \varepsilon + 4(C_{3.8}(K+R))^{-4}\|v_{\bar x}\|_2\big). \tag{3.26} \]
Setting

\[ \tau = \frac{\rho'}{16(K+R)}, \qquad \varepsilon = \frac{\rho'\rho}{16(K+R)}, \tag{3.27} \]

enlarging $C_{3.8}$ further, if necessary, and recalling the fact that $\|v_{\bar x}\|_2 \ge \rho/\sqrt2$, from the inequality (3.26) we further deduce that

\[ \|(A_n+D_n)x\|_2 \ge \frac{\rho'}{4}\|v_{\bar x}\|_2\sqrt{pn} \ge \frac{\rho'}{4\sqrt2}\,\rho\sqrt{pn}. \]

It thus remains to check that (3.23) holds for the choice of parameters in (3.27). To this end, recall that $m = (8p)^{-1}$ and $M \le cn$. Substituting this in (3.23), we obtain

\[ \binom{n}{m}\binom{n-m}{M-m} \le \binom{n}{m}\binom{n}{M}. \]
Therefore (3.23) yields

\[ |\mathcal{M}'| \le |\mathcal{M}| \le \Big(\frac{48e(K+R)}{c\rho'}\Big)^{cn}\Big(\frac{384(K+R)pn}{\rho'\rho}\Big)^{(8p)^{-1}}. \]

Thus, we have to show that

\[ \Big(\frac{48e(K+R)}{c\rho'}\Big)^{cn}\Big(\frac{384(K+R)pn}{\rho'\rho}\Big)^{(8p)^{-1}} \le \exp\big((c'/2)n\big). \tag{3.28} \]

To this end, we claim that

\[ \Big(\frac{384(K+R)pn}{\rho'\rho}\Big)^{(8p)^{-1}} \le \Big(\frac{48e(K+R)}{c\rho'}\Big)^{cn}, \]

from which it is easy to see that the bound in (3.28) follows if $c$ is chosen small enough with respect to $c'$. Turning to prove our claim, we note that it is enough to prove that

\[ p^{-1}\log\Big(\frac{pn}{\rho}\Big) \ll n, \]

which is immediate since $np \to \infty$, and $\log(1/\rho) \ll np$. This shows that (3.23) holds and thus the proof is completed. □

Finally we are ready to prove Proposition 3.1.

Proof of Proposition 3.1. Since $\mathrm{Sparse}(M) \cap S^{n-1} \subset \mathrm{Dom}(M, (C_{3.8}(K+R))^{-4})$, with the help of Lemma 3.8, the proof is completed using the same arguments as in the proof of Lemma 3.4. The details are omitted. □

Remark 3.9. In the proof of Theorem 1.1 we will also need a modification of Proposition 3.1, where the matrix under consideration is not an $n \times n$ matrix, but an $(n-1) \times n$ matrix. One can check that if modified versions of Lemma 3.2 and Corollary 3.7, applicable to $(n-1) \times n$ matrices, are available, then the rest of the proof remains exactly the same. Moreover, for $(n-1) \times n$ matrices one can easily reprove Lemma 3.2 and Corollary 3.7 with slightly worse bounds. Therefore the proof of the required modification of Proposition 3.1 is straightforward, and hence all the details are omitted.


Remark 3.10. In Proposition 3.1 we computed a probability bound for the infimum of $\|(A_n + D_n)x\|_2$ over dominated and compressible vectors $x \in \mathbb{R}^n$. This treatment of the infimum for a general real-valued diagonal matrix $D_n$ such that $\|D_n\| \le R\sqrt{np}$ for some finite positive $R$ is motivated by the analysis of the limiting spectral distribution of $A_n$. It is well known that a key step in such an analysis is the control of $s_{\min}(A_n - \omega\sqrt{np}\,I_n)$ for $\omega \in B(0,R) \subset \mathbb{C}$, for some finite $R$ (see [7]).

It can be easily checked that the proofs of Lemma 3.3 and Lemma 3.4 remain the same when we allow $D_n$ to be a complex-valued diagonal matrix and the infimum is taken over compressible and dominated vectors in $\mathbb{C}^n$. Corollary 3.7 also continues to hold for vectors in $\mathbb{C}^n$. However, the proof of Lemma 3.8 uses some estimates on the size of nets in $\mathbb{R}^n$. Therefore those steps need some modifications. To this end, note that (3.16) becomes



2 cn 
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and (3.23) becomes 


\[ |\mathcal{M}| \le \binom{n}{m}\binom{n-m}{M-m}\Big(\frac{3}{\varepsilon}\Big)^{2|I|}\Big(\frac{3}{\tau}\Big)^{2|J|}, \]
and the rest of the estimates remain the same. Shrinking the constant $c$, if necessary, and repeating the same steps, one can deduce the conclusion of Lemma 3.8, possibly with slightly worse constants. Building on this, one can then extend the result of Proposition 3.1 to the case where $D_n$ is a complex-valued diagonal matrix and the infimum is taken over complex vectors.

To obtain the necessary bound on $s_{\min}(A_n - \omega\sqrt{np}\,I_n)$ for $\omega \in B(0,R) \subset \mathbb{C}$, we also need a modified version of Proposition 4.1 for complex vectors. However, this is not a straightforward extension from the real case. See also Remark 4.5.


4. Vectors with a small LCD 

Bounding the smallest singular value of a random matrix $A_n$ depends crucially on a strong estimate of the Lévy concentration function of $A_n x$ for $x \in S^{n-1}$. Such an estimate, however, is impossible to achieve for a vector having a rigid arithmetic structure. As such structure is measured by the LCD (recall Definition 2.6), we have to treat the vectors with a small LCD separately. Fortunately, the set of vectors with a smaller LCD has a smaller complexity, i.e. a smaller $\varepsilon$-net size. We encounter two opposite effects: a larger LCD means a better Lévy concentration function bound and, at the same time, a larger complexity of the set. We show below that these two effects compensate each other precisely. To this end, we partition the set of vectors with a small LCD into the sets $S_L$ on which the LCD roughly equals $L$. Since the LCD is roughly constant on $S_L$, we obtain a uniform bound on the Lévy concentration function, and thereby, using an $\varepsilon$-net, we show that the infimum over $S_L$ is well controlled.

Since we have already obtained a lower bound on the infimum over compressible and dominated vectors in Proposition 3.1, we will consider vectors which are neither compressible nor dominated. For $p^{-1} \le M \le c_{3.1}n$, and $\rho$ as in Proposition 3.1, define
\[ W := \{x \in S^{n-1} \mid x \notin \mathrm{Comp}(M,\rho) \cup \mathrm{Dom}(M, (C_{3.1}(K+R))^{-4})\}. \]
Next, for $v \in \mathbb{R}^n$, let $I(v) := \mathrm{Supp}(v_{[M+1:n]})$ be the set of small coordinates, and let $v_{I(v)} = v_{[M+1:n]}$. Recall that for $x \in S^{n-1}$, its LCD is defined as
\[ D(x) := \inf\Big\{\theta > 0 : \mathrm{dist}(\theta x, \mathbb{Z}^n) < (\delta_0 p)^{-1/2}\sqrt{\log_+(\sqrt{\delta_0 p}\,\theta)}\Big\}, \]
where $\delta_0 \in (0,1)$ is chosen as in Remark 2.7. As mentioned above, we need to define level sets $S_L$. Since the diagonal entries of $A_n$ are zero, we need to work with the following modified definition of level sets. For any $L \ge 1$, we define
\[ S_L := \Big\{v \in W \ \Big|\ L \le \inf_{i \in [n]} D\big(v_{I(v)\setminus\{i\}} / \|v_{I(v)\setminus\{i\}}\|_2\big) < 2L\Big\}. \]
We are now ready to state the main result of this section.

Proposition 4.1. Fix $K, R \ge 1$, and let $D_n$ be a non-random diagonal matrix with real entries such that $\|D_n\| \le R\sqrt{np}$. Let $A_n^{D,m}$ be the $m \times n$ matrix obtained from $(A_n + D_n)^{\mathsf{T}}$ by collecting its last $m$ rows, where $A_n$ is the matrix defined in Theorem 1.1. When $D_n = 0$, we write $A_n^{m}$ instead of $A_n^{D,m}$. Fix a positive real $r \ge 1$. Then there exist small positive constants $c_{4.1}, c'_{4.1}, c''_{4.1}$ and a large positive constant $C_{4.1}$, depending only on the fourth moment of $\xi_{ij}$, and small positive constants $\bar{c}_{4.1}, \tilde{c}_{4.1}$, depending on the fourth moment of $\xi_{ij}$ and $r$, such that, if $r \ge (C_{4.1}(K+R))^2$, then for $r p^{-1/2} \le L \le \exp(c''_{4.1}pn/(K+R)^2)$ and $m \ge n - c''_{4.1}(K+R)^2/p$, we have
\[ \mathbb{P}\Big(\inf_{v \in S_L} \|A_n^{D,m}v\|_2 \le c_{4.1}\rho\,\varepsilon_0\sqrt{pn} \ \text{ and } \ \|A_n^{m}\| \le K\sqrt{pn}\Big) \le \exp(-\bar{c}_{4.1}n), \]
where
\[ \varepsilon_0 = \min\big(\tilde{c}_{4.1}/\sqrt{r},\ c'_{4.1}\sqrt{n}/L\big). \]


As in Section 3, a crucial tool here is a bound on the Lévy concentration function (recall Definition 2.3). However, the estimate obtained in Corollary 3.7 is not sufficient for incompressible vectors. To this end, we find estimates in terms of the LCD, see Definition 2.6. For $\delta_0$ as in Remark 2.7, from [32, Theorem 6.3] we get the following result:


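Proposition 4.2. Let $S \in \mathbb{R}^n$ be a random vector with i.i.d. coordinates of the form $S_j = \delta_j\xi_j$, where $\mathbb{P}(\delta_j = 1) = p$, and the $\xi_j$'s are random variables with unit variance and finite fourth moment, which are independent of the $\delta_j$'s. Then for any $v \in S^{n-1}$,
\[ \mathcal{L}\big(\langle S, v\rangle, \sqrt{p}\,\varepsilon\big) \le C_{4.2}\Big(\varepsilon + \frac{1}{\sqrt{p}\,D(v)}\Big) \tag{4.1} \]
for some constant $C_{4.2}$, depending only on $\mathbb{E}|\xi_j|$ and $\mathbb{E}\xi_j^4$.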
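
To make the role of arithmetic structure in Proposition 4.2 concrete, the following minimal Python sketch (not part of the original argument) computes the exact law of $\sum_j \delta_j\xi_j k_j$ for an integer weight vector $k$, under the illustrative assumption that the $\xi_j$ are Rademacher signs. The function names `sparse_sum_distribution` and `largest_atom` are hypothetical. The largest atom of the law is a lower bound for the Lévy concentration function at every radius, so a direction with a small LCD (all weights equal) should exhibit a visibly larger atom than a less structured direction.

```python
from collections import defaultdict


def sparse_sum_distribution(weights, p):
    """Exact law of sum_j delta_j * xi_j * weights[j] for integer weights.

    Assumption (illustration only): delta_j ~ Bernoulli(p) and xi_j is
    uniform on {-1, +1}, all independent.
    Returns a dict mapping each attainable value to its probability.
    """
    dist = {0: 1.0}
    for w in weights:
        nxt = defaultdict(float)
        for value, prob in dist.items():
            nxt[value] += prob * (1.0 - p)      # delta_j = 0
            nxt[value + w] += prob * p / 2.0    # delta_j = 1, xi_j = +1
            nxt[value - w] += prob * p / 2.0    # delta_j = 1, xi_j = -1
        dist = dict(nxt)
    return dist


def largest_atom(weights, p):
    """sup_u P(sum = u): a lower bound on the Levy concentration function."""
    return max(sparse_sum_distribution(weights, p).values())
```

For example, one may compare `largest_atom([1] * 20, 0.5)` with `largest_atom(list(range(1, 21)), 0.5)`; after normalization these weight vectors correspond to a unit vector with a small LCD and to one with a larger LCD, respectively.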

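Let $I \subset [n]$, and for any $v \in \mathbb{R}^n$, let $v_I \in \mathbb{R}^n$ be the vector with coordinates $v_I(j) = v(j)\cdot\mathbf{1}(j \in I)$. Since the diagonal entries of $A_n$ are zero, depending on the value of $m$, for every $i \in [m]$ there exists a $j \in [n]$ such that $(A_n^{m})_{i,j} = 0$. Thus, applying Proposition 4.2, we deduce that
\[ \mathcal{L}\big((A_n^{D,m}v)_i,\ \|v_{I\setminus\{j\}}\|_2\sqrt{p}\,\varepsilon\big) \le \mathcal{L}\big((A_n^{m}v_{I\setminus\{j\}})_i,\ \|v_{I\setminus\{j\}}\|_2\sqrt{p}\,\varepsilon\big) \le C_{4.2}\Big(\varepsilon + \frac{1}{\sqrt{p}\,D(v_{I\setminus\{j\}}/\|v_{I\setminus\{j\}}\|_2)}\Big) \le C_{4.2}\Big(\varepsilon + \frac{1}{\sqrt{p}\,\inf_{j\in[n]}D(v_{I\setminus\{j\}}/\|v_{I\setminus\{j\}}\|_2)}\Big). \]
Now a direct application of [32, Remark 3.5] gives the following result on tensorization, which allows us to transfer the bound on the Lévy concentration function from random variables to random vectors: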


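Proposition 4.3. Let $A_n^{D,m}$ be the matrix defined in Proposition 4.1. Then for any $\varepsilon > 0$ and any $I \subset [n]$ we have
\[ \mathcal{L}\Big(A_n^{D,m}v,\ \varepsilon\inf_{j\in[n]}\|v_{I\setminus\{j\}}\|_2\sqrt{pm}\Big) \le C_{4.3}^m\Big(\varepsilon + \frac{1}{\sqrt{p}\,\inf_{j\in[n]}D(v_{I\setminus\{j\}}/\|v_{I\setminus\{j\}}\|_2)}\Big)^m, \tag{4.2} \]
where $C_{4.3}$ is some constant, depending only on $\mathbb{E}|\xi_{ij}|$ and $\mathbb{E}\xi_{ij}^4$.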


Setting the parameter $L = (\delta_0 p)^{-1/2}$ in [32, Definition 6.1], we note that the definition of the LCD there matches our definition of the LCD. Therefore from [32, Lemma 6.2], we immediately obtain:

Proposition 4.4. Let $x \in S^{n-1}$. Then
\[ D(x) \ge \frac{1}{2\|x\|_\infty}. \]

We now proceed to the proof of Proposition 4.1.

Proof of Proposition 4.1. The proof relies on a covering argument. The lower bound for the LCD is used to obtain the uniform estimate for the Lévy concentration function. Then we construct a special $\varepsilon_0$-net of small cardinality, and extend the Lévy concentration function estimate from one point to the whole net by the union bound. Finally, we use approximation to extend this bound to the set $S_L$.


Step 1. Recall 

rp ” 1 / 2 < L < exp(c / | ^pn/(K + R) 2 ) and eo = min(c 4 j/-\/r, -^y/n/L). 

Since L^/p > r, and np — > oo, we have that Eq > Thus, for v G Sl, by (4.2), we immediately 
obtain that 

C(A®' m v, inf 2 cE 0 ^prn) < £(A™v, ini \\v^ v) \ {j} \\ c£ 0 ^pm) < e™, 

j£[n] * ie[n] 

where c = ( 2 C 4 g) -1 . 


Step 2. To make the approximation possible, we have to approximate the large and the small coordinates of $v$ differently. Since $v \in S_L$, for some $j \in [n]$ we have $D(v_{I(v)\setminus\{j\}}/\|v_{I(v)\setminus\{j\}}\|_2) \le 2L$. For this $j \in [n]$, a scaled copy of the vector $v_{I(v)\setminus\{j\}}$ is close to an integer point. We will use a scaled copy of this point to approximate $v_{I(v)\setminus\{j\}}$. We do not have any information about the vector $v_{I^c(v)\cup\{j\}}$ besides $\|v_{I^c(v)\cup\{j\}}\|_2 \le 1$. Therefore, this vector will be approximated in the $\ell_2$ norm using the standard volumetric estimate.

Now, we pass to the details of this construction. To this end, fixing $I \subset [n]$, a set of cardinality $n - r^2p^{-1}$, we denote
\[ \mathcal{Z}_I := \{z \in \mathbb{Z}^n \mid \mathrm{supp}(z) \subset I \text{ and } 0 < \|z\|_2 \le 2L\}, \]
and let $\mathcal{N}_I := \{z/\|z\|_2 \mid z \in \mathcal{Z}_I\}$. A simple volumetric calculation shows that
\[ |\mathcal{N}_I| \le \Big(2 + \frac{cL}{\sqrt{n}}\Big)^{n - r^2p^{-1}} \]
for a universal constant $c$. Also, there exists a $(c\varepsilon_0\rho/10(K+R))$-net (the constant $c$ is the constant obtained in Step 1) $\mathcal{M}_I$ in $\{x \in B_2^n \mid \mathrm{supp}(x) \subset I^c\}$ of cardinality
\[ |\mathcal{M}_I| \le \Big(\frac{30(K+R)}{c\varepsilon_0\rho}\Big)^{r^2p^{-1}}. \]
Let $\mathcal{N}_0$ be a $(c\varepsilon_0\rho/10(K+R))$-net in $[\rho/2, 1]$ of cardinality
\[ |\mathcal{N}_0| \le \frac{30(K+R)}{c\varepsilon_0\rho}. \]
Set
\[ \mathcal{M}^{(1)} := \bigcup_{I \subset [n],\ |I| = n - r^2p^{-1}} \{x + ty \mid x \in \mathcal{M}_I,\ y \in \mathcal{N}_I,\ t \in \mathcal{N}_0\}. \]
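
The volumetric estimate for integer points used above can be explored numerically. The sketch below is an illustration only, with the hypothetical function name `lattice_points_in_ball`: it counts the points of $\mathbb{Z}^n$ in a Euclidean ball exactly, by dynamic programming over squared norms. The count includes the origin, whereas $\mathcal{Z}_I$ excludes it, and normalizing can only merge points, so the count with radius $2L$ is an upper bound for $|\mathcal{N}_I|$ when $n$ is replaced by $|I|$.

```python
import math


def lattice_points_in_ball(n, radius):
    """Number of z in Z^n with ||z||_2 <= radius (the origin is included)."""
    budget = int(math.floor(radius * radius + 1e-9))
    # counts[s] = number of integer vectors built so far with squared norm s
    counts = [0] * (budget + 1)
    counts[0] = 1
    for _ in range(n):
        nxt = [0] * (budget + 1)
        for s, c in enumerate(counts):
            if c == 0:
                continue
            k = 0
            while s + k * k <= budget:
                # a new coordinate equal to +k or -k (just 0 when k == 0)
                nxt[s + k * k] += c if k == 0 else 2 * c
                k += 1
        counts = nxt
    return sum(counts)
```

Taking the $n$-th root of `lattice_points_in_ball(n, 2 * L)` for moderate $n$ and $L$ gives a per-coordinate growth rate that can be compared with the form $2 + cL/\sqrt{n}$ of the bound; the value of the universal constant $c$ is not specified in the text, so no particular value is asserted here.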


This set $\mathcal{M}^{(1)}$ does not quite serve as an appropriate $\varepsilon$-net of $S_L$, because we also need to consider those $v \in S_L$ for which $j \in I(v)$. For such $v$, the cardinality of $I(v)\setminus\{j\}$ is $n - r^2p^{-1} - 1$. Thus we need a modification of the set $\mathcal{M}^{(1)}$. Namely, we denote
\[ \mathcal{M}^{(2)} := \bigcup_{I \subset [n],\ |I| = n - r^2p^{-1} - 1} \{x + ty \mid x \in \mathcal{M}_I,\ y \in \mathcal{N}_I,\ t \in \mathcal{N}_0\}, \]
where the estimates on the cardinality of $\mathcal{N}_I$ and $\mathcal{M}_I$ now change to
\[ |\mathcal{N}_I| \le \Big(2 + \frac{cL}{\sqrt{n}}\Big)^{n - r^2p^{-1} - 1} \quad\text{and}\quad |\mathcal{M}_I| \le \Big(\frac{30(K+R)}{c\varepsilon_0\rho}\Big)^{r^2p^{-1}+1}. \]
Set $\mathcal{M} := \mathcal{M}^{(1)} \cup \mathcal{M}^{(2)}$. Therefore the previous estimates now yield


\M\ < |AT (1) | + \M {2) \ < 2c 


n 
. 2 „-l 


/ 30(A' + R) 


(4.3) 


< 2c 


r 2 p 1 + 1 

'C(K + R)n 
r 2 p~ 1 eop 


y 


£0P 
cL 


r 2 p~ 1 +2 


cL 
2 + — 
n 


r 2 p X + 1 


9 — 1 

n—rp 


2 + 


_ 1 \ r 2 p-Ml 


/30 (K + R) 




£o P 


cL \ n+1 


2 + — 
n J 


where C is some absolute constant. Recall that p = (C 3 ]+) _ ^ 0_6 (see Proposition 3.1), where £q 
is defined in (3.5). Thus log(l/p) <C np, and therefore we can choose a constant ci arbitrarily small 
such that /A 1 < exp+np) for all large n. Hence the third term in the RHS of (4.3) is bounded 
above by A exp(cipn), where c\ is another arbitrarily small positive finite constant. Similarly, we 
conclude that 


C(K + R)n 


r 2 p 


r 2 p J +1 


< exp( 2 r 2 cin). 


Next, from the upper bound of L, and the definition of £q it follows that 

np ^ x np \fnpL exp(2 c\ A pn) 




< (4i )' 


#(2 + %) 4 


1 


-4.1 


The last inequality follows from the assumption (1.5), and the upper bound on L. Therefore 
combining all the estimates we get 


(4.4) 


\M\< 


exp( 2 r 2 cin + c\pn) exp(4 r 2 c'^ ^ n 


2 + 


cL 


n+1 


Eoi ^) 2 ^- 1 C n 

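Now we will show that $\mathcal{M}$ serves as an appropriate $\varepsilon$-net for $S_L$. To this end, let $v \in S_L$. Then there exists $j \in [n]$ such that $D(v_{I(v)\setminus\{j\}}/\|v_{I(v)\setminus\{j\}}\|_2) \le 2L$. Let us assume that $j \in I(v)$, and write $v = v_{I(v)\setminus\{j\}} + v_{I^c(v)\cup\{j\}}$. We claim that there exists $v' = x + ty \in \mathcal{M}^{(2)}$ such that
\[ \|v_{I^c(v)\cup\{j\}} - x\|_2 \le \frac{c\rho\varepsilon_0}{10(K+R)}, \qquad \Big\|\frac{v_{I(v)\setminus\{j\}}}{\|v_{I(v)\setminus\{j\}}\|_2} - y\Big\|_2 \le \frac{2\sqrt{\log(\sqrt{\delta_0 p}\cdot 2L)}}{\sqrt{\delta_0 p}\,L}, \]
\[ \text{and}\quad \big|t - \|v_{I(v)\setminus\{j\}}\|_2\big| \le \frac{c\rho\varepsilon_0}{10(K+R)}. \tag{4.5} \]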


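Indeed, choose $x \in \mathcal{M}_{I(v)\setminus\{j\}}$ such that $\|v_{I^c(v)\cup\{j\}} - x\|_2 \le \frac{c\rho\varepsilon_0}{10(K+R)}$. By the definition of the LCD, we can find $\theta > 0$ and $z \in \mathcal{Z}_{I(v)\setminus\{j\}}$ such that
\[ \Big\|\theta\,\frac{v_{I(v)\setminus\{j\}}}{\|v_{I(v)\setminus\{j\}}\|_2} - z\Big\|_2 < \frac{\sqrt{\log_+(\sqrt{\delta_0 p}\,\theta)}}{\sqrt{\delta_0 p}}. \]
Since $L \le D(v_{I(v)\setminus\{j\}}/\|v_{I(v)\setminus\{j\}}\|_2) \le 2L$, we have $L \le \theta \le 2L$, which implies
\[ \Big\|\frac{v_{I(v)\setminus\{j\}}}{\|v_{I(v)\setminus\{j\}}\|_2} - \frac{z}{\theta}\Big\|_2 < \frac{\sqrt{\log(\sqrt{\delta_0 p}\cdot 2L)}}{\sqrt{\delta_0 p}\,L}. \]
Thus, setting $y = z/\|z\|_2 \in \mathcal{N}_{I(v)\setminus\{j\}}$, we obtain
\[ \Big\|y - \frac{z}{\theta}\Big\|_2 = \Big|\frac{\|z\|_2}{\theta} - 1\Big| \le \Big\|\frac{v_{I(v)\setminus\{j\}}}{\|v_{I(v)\setminus\{j\}}\|_2} - \frac{z}{\theta}\Big\|_2, \]
and therefore
\[ \Big\|\frac{v_{I(v)\setminus\{j\}}}{\|v_{I(v)\setminus\{j\}}\|_2} - y\Big\|_2 \le 2\,\Big\|\frac{v_{I(v)\setminus\{j\}}}{\|v_{I(v)\setminus\{j\}}\|_2} - \frac{z}{\theta}\Big\|_2 < \frac{2\sqrt{\log(\sqrt{\delta_0 p}\cdot 2L)}}{\sqrt{\delta_0 p}\,L}. \]
Finally, noting that $S_L \subset (\mathrm{Comp}(M,\rho))^c$, we have that for any $j \in [n]$
\[ \|v_{I(v)\setminus\{j\}}\|_2 \ge \|v_{I(v)}\|_2 - \|v_{I(v)}\|_\infty \ge \tfrac{1}{2}\|v_{I(v)}\|_2 \ge \frac{\rho}{2}, \tag{4.6} \]
where the second last step follows upon choosing $r$ sufficiently large. Therefore we can choose $t \in \mathcal{N}_0$ so that $\big|t - \|v_{I(v)\setminus\{j\}}\|_2\big| \le \frac{c\rho\varepsilon_0}{10(K+R)}$.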

In the proof of (4.5) we have assumed that $j \in I(v)$. One can repeat the same proof when $j \notin I(v)$, to conclude that in this case there exists $v' \in \mathcal{M}^{(1)}$ such that (4.5) still holds. Hence, combining these two arguments, we obtain that for every $v \in S_L$ there exists a $v' \in \mathcal{M}$ such that (4.5) holds.

The deficiency of this construction is that $\mathcal{M} \not\subset S_L$, so we cannot use the small ball estimates we obtained for the points of $S_L$ in Step 1. This, however, can be easily corrected. For any point $v' = x + ty \in \mathcal{M}$ choose one point $v \in S_L$ satisfying (4.5), whenever it exists. If such a point does not exist, we skip the point $v'$. These points $v$ form a set $\mathcal{M}'$, which can be used instead of $\mathcal{M}$. Indeed, the triangle inequality implies that for any $w \in S_L$ there exist $v = x + ty \in \mathcal{M}'$ and $j \in [n]$ such that
\[ \|w_{I^c(w)\cup\{j\}} - x\|_2 \le \frac{c\rho\varepsilon_0}{5(K+R)}, \qquad \Big\|\frac{w_{I(w)\setminus\{j\}}}{\|w_{I(w)\setminus\{j\}}\|_2} - y\Big\|_2 \le \frac{4\sqrt{\log(\sqrt{\delta_0 p}\cdot 2L)}}{\sqrt{\delta_0 p}\,L}, \]
\[ \text{and}\quad \big|t - \|w_{I(w)\setminus\{j\}}\|_2\big| \le \frac{c\rho\varepsilon_0}{5(K+R)}. \tag{4.7} \]
Obviously, $|\mathcal{M}'| \le |\mathcal{M}|$.


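Step 3. By Step 1, for any $v \in \mathcal{M}'$,
\[ \mathcal{L}\Big(A_n^{D,m}v,\ \inf_{j\in[n]}\|v_{I(v)\setminus\{j\}}\|_2\, c\varepsilon_0\sqrt{pm}\Big) \le \varepsilon_0^m. \]
Now from (4.6), we also have that
\[ \inf_{j\in[n]}\|v_{I(v)\setminus\{j\}}\|_2 \ge \tfrac{1}{2}\|v_{I(v)\setminus\{j_*\}}\|_2 \quad\text{for all } j_* \in [n]. \]
Hence, absorbing the factor $1/2$ in $c$, we deduce the following estimate on the Lévy concentration function:
\[ \mathcal{L}\big(A_n^{D,m}v,\ \|v_{I(v)\setminus\{j\}}\|_2\, c\varepsilon_0\sqrt{pm}\big) \le \varepsilon_0^m, \qquad \forall j \in [n],\ \forall v \in \mathcal{M}'. \]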
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A 


D.m/- 


Thus by the union bound, 

P^3u = x + ty € M.' such that 
(4.8) 

Assume first that < 1. Using (4.4), we see that the RHS of (4.8) is bounded by 


(x + ty) < csoy/pn ■ t ) < |A4| max £(A„’ m n, tceoJpn) 

2 / i+tj/SAT' 

< IKK 


(4.9) 


1 / 

— exp — n 

£o V 


m 


1 1 2r 2 1 

— log-log 6 - log-log -4r 2 (ci + ) 

n e n c np -y 


Shrinking, if necessary, the constants C 4 and 4 , we have 4 log(l/eo) > log( 6 /c). Choose c ' 4 ^ 
and ci small enough such that 4r 2 (ci + C 4 4 ) < ^ log(l/eo)- Finally, noting that np 00 , and 
m/n > 1/2, we deduce that (4.9) is bounded by exp(—c'n), for some small positive constant c'. 

Otherwise, if > 1, using the facts that ( n—m)/n < c 4 ^ (K+R) 2 /(pn) and L < exp(c^ ^ pn/(K+ 
R ) 2 ), and choosing c^ ^ sufficiently small, and shrinking 4 , if necessary, we deduce that the right 
hand side of (4.8), is bounded by 

/ 3cL x n+l 


V V n 


c^yjn 


L 


= exp — (n + 1 ) • 


, 1 n + 1 — m 

log —7 -—— log 

Jcc^ ^ n + l 


L 


c^ -p/n 


(4.10) 


< exp —n 


1 


l0 ^ 3hh c 4.1 c 4.1 


^ log + 


c 4.1 P 


< exp (—c w n), 


l - 41 np c ' 41 (/\+R)^ 

for another small positive constant c". 

Therefore we have obtained that
\[ \mathbb{P}\Big(\forall\, x + ty \in \mathcal{M}' : \ \|A_n^{D,m}(x + ty)\|_2 > c\varepsilon_0\sqrt{pn}\cdot t\Big) \ge 1 - \exp(-\bar{c}n), \]
where $\bar{c} = \min\{c', c''\}$. Now we restrict ourselves to this event of very large probability. Consider any $w \in S_L$. By our construction of $\mathcal{M}'$ there exist $x + ty \in \mathcal{M}'$ and $j \in [n]$ satisfying (4.7). Therefore, on this event of large probability we have that

A°’ m w 


> 


A Z’ m (x + ty) 


-OKI + \\D„ 


W l c {w)u{j} x \\2 + W I{w)\{j} || ^T(ui)\{ 1 } || 2 ^ 


+ 


KHYUHI2 


— t 


> c£ot^/pn — (K + R)y/pn ■ 


2 cp£ 0 


4y / log(\/^p- 2L) M , 
+ --||W/( W )| 


5 (K + R) \fSiypL 

Recall that for w £ Sl C W, we have || u, /(iD)\{j} || 9 > p/2. This and (4.7) imply 


2 p M |, 

^ 5 " — II 2 _ 


cp£ 0 


5 (K + R) 5 


2 p 1 „ „ 

-> — llwj(«,)\{j}ll 2 


20 
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Combining this with the previous inequality, we obtain 


A°’ m w 2 > VP™\\ w i(w)\{j}\\ 2 “ ( K + R ) 


4y / log( % /%j • 2L) 

VfioP L 


If c aj/ k 


2 ■— > ~^jr, then eo = Iu this case using the fact that L > rp 1 / 2 , and observing that 

for positive constants aq, 0 : 2 , the function x i-a is a decreasing function for large values of 

x , we obtain 

--(K + R) 4 V 1 °g(\ / ^oP ' 2L ) cc 4.1 , K , i > ) 4 \/ 1 og( 2 \^o r ) 

20 1 J - 20 ^ 1 J 

Now choosing r > (C 4 \ {K + R )) 2 , for a sufficiently large constant C 4 f we obtain 

A n w 2 > IKh\{1 }|| 2 • 4^ ^ go ■ 

C ' C/| 1 C /| 1 

Otherwise, when — < -^=L, we have £o — —• Since L < exp(c^ ^ pn/(K + R) 2 ), choosing 

c '4 2 sufficiently small, we have 

ce 0 


_ Vto^C 2 l) = 
20 1 7 


_ 4(K + i?)Vlog(\/^p2L)^ 

20 


eoV^opL 




~ £o \20~ 


c 4 (if + A) v / log(v / 3op2L) \ 


> £o 


c 

20 


“ £ ° 20 ~ 


C 4 ' 

4(-K" + R)^log(y%p2exp(c^ 1 pn/(K + i?.) 2 )) 

C 4 i\AW? 

4(A' + A)y / log(V^op2) + c^pn/CftT + A) 2 ) 


c' 41 V<ionp 


> eo 


c 4 V c 4.1 x 


> txEo- 


20 “ 40 


Therefore, in this case 


A®' m w 


> yjpn • || wj (w )\ {i }|| 2 • ^ e 0 > — Evpjwi . 
Thus combining both the cases, and setting c 4 \ = the proof is completed. 


□ 


Remark 4.5. The proof of Proposition 4.1 crucially uses an $\varepsilon$-net argument. If we allow $D_n$ to be a complex-valued diagonal matrix in Proposition 4.1, then the sets $S_L$ become subsets of the complex unit sphere, whose real dimension is $2n - 1$ instead of $n - 1$. Hence, (4.3) changes to
\[ |\mathcal{M}| \le 2\binom{n}{r^2p^{-1}+1}\Big(\frac{30(K+R)}{c\varepsilon_0\rho}\Big)^{2(r^2p^{-1}+2)}\Big(2 + \frac{cL}{\sqrt{n}}\Big)^{2(n - r^2p^{-1})}. \]
The reader can easily convince her/himself that the resulting bound is not exponentially small anymore. Thus the proof breaks down in the complex case. Since the extension of Proposition 4.1 to complex-valued $D_n$ is quite involved, we defer it to [4], where we use it to derive the circular law.

5. Proof of Theorem 1.1 

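In this section we combine the results from Section 3 and Section 4 to prove Theorem 1.1.

Proof of Theorem 1.1. Recalling that $\Omega_K = \{\|A_n\| \le K\sqrt{np}\}$, we note that for any $\vartheta > 0$,
\[ \mathbb{P}\big(\{s_{\min}(A_n + D_n) \le \vartheta\} \cap \Omega_K\big) \le \mathbb{P}\Big(\Big\{\inf_{x \in V^c}\|(A_n + D_n)x\|_2 \le \vartheta\Big\} \cap \Omega_K\Big) + \mathbb{P}\Big(\Big\{\inf_{x \in V}\|(A_n + D_n)x\|_2 \le \vartheta\Big\} \cap \Omega_K\Big), \tag{5.1} \]
where
\[ V := S^{n-1} \setminus \big(\mathrm{Comp}(c_{3.1}n, \rho) \cup \mathrm{Dom}(c_{3.1}n, (C_{3.1}(K+R))^{-4})\big), \]
and $\rho$ is as in Proposition 3.1. Using Proposition 3.1 with $M = c_{3.1}n$, we obtain that
\[ \mathbb{P}\Big(\inf_{x \in V^c}\|(A_n + D_n)x\|_2 \le C_{3.1}(K+R)\rho\sqrt{np},\ \|A_n\| \le K\sqrt{pn}\Big) \le \exp(-\bar{c}_{3.1}np). \]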

Therefore it only remains to find an upper bound on the second term in the RHS of (5.1). Now using Lemma 2.4, we see that to find an upper bound of
\[ \mathbb{P}\Big(\Big\{\inf_{x \in V}\|(A_n + D_n)x\|_2 \le \varepsilon\rho^2\sqrt{\frac{p}{n}}\Big\} \cap \Omega_K\Big) \]
it is enough to find the same for
\[ \mathbb{P}\big(\{\mathrm{dist}(A_{n,j}, H_{n,j}) \le \rho\sqrt{p}\,\varepsilon\} \cap \Omega_K\big) \quad\text{for a fixed } j, \]
where $A_{n,j}$ are now the columns of $(A_n + D_n)$ (see also Remark 2.5). As these estimates are the same for different $j$, we consider only $j = 1$. Let $A_n^{D}$ be the $(n-1) \times n$ matrix whose rows are the columns $A_{n,2}, \ldots, A_{n,n}$. Note that it is the matrix $A_n^{D,m}$ defined in Proposition 4.1 for $m = n - 1$; for ease of writing, hereafter we omit the superscript $m$. Let $v \in S^{n-1} \cap \mathrm{Ker}(A_n^{D})$, where $\mathrm{Ker}(A_n^{D}) := \{x \in \mathbb{R}^n \mid A_n^{D}x = 0\}$. Since
\[ \mathrm{dist}(A_{n,1}, H_{n,1}) \ge |\langle v, A_{n,1}\rangle|, \]
it is enough to prove that
\[ \mathbb{P}\big(\{\exists\, v \in S^{n-1} \text{ such that } A_n^{D}v = 0 \text{ and } |\langle A_{n,1}, v\rangle| \le \rho\varepsilon\sqrt{p}\} \cap \Omega_K\big) \le \varepsilon + \exp(-cpn/(K+R)^2), \]
for some positive constant $c$. We partition $S^{n-1}$ into the set of compressible and dominated vectors and its complement again. Setting $Q = (2C_{4.1}(K+R))^{12}p^{-1}$, denote
\[ W = S^{n-1} \setminus \big(\mathrm{Comp}(Q, \rho) \cup \mathrm{Dom}(Q, (C_{4.1}(K+R))^{-4})\big). \]
Then
\[ \mathbb{P}\big(\{\exists\, v \in S^{n-1} \text{ such that } A_n^{D}v = 0 \text{ and } |\langle A_{n,1}, v\rangle| \le \rho\varepsilon\sqrt{p}\} \cap \Omega_K\big) \]
\[ \le \mathbb{P}\big(\{\exists\, v \in W^c \text{ such that } A_n^{D}v = 0\} \cap \Omega_K\big) + \mathbb{P}\big(\{\exists\, v \in W \text{ such that } A_n^{D}v = 0 \text{ and } |\langle A_{n,1}, v\rangle| \le \rho\varepsilon\sqrt{p}\} \cap \Omega_K\big). \tag{5.2} \]
This time, we apply Proposition 3.1 with $M = Q$. It yields that the first term in the RHS of (5.2) does not exceed $\exp(-\bar{c}_{3.1}np)$. Although this proposition was proved for $n \times n$ matrices, the same proof works for $(n-1) \times n$ matrices as well (see also Remark 3.9).

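Let $w \in W$. Since $w \notin \mathrm{Dom}(Q, (C_{4.1}(K+R))^{-4})$,
\[ \|w_{I(w)}\|_2 \ge (C_{4.1}(K+R))^{-4}\sqrt{Q}\,\|w_{I(w)}\|_\infty \ge 4(C_{4.1}(K+R))^2p^{-1/2}\|w_{I(w)}\|_\infty. \]
We also recall that, for $p \le c(K+R)^{-2}$ and a sufficiently small $c$ (see (3.14)), we have
\[ \|w_{I(w)\setminus\{i\}}\|_2 \ge \tfrac{1}{2}\|w_{I(w)}\|_2, \quad\text{for } i \in [n]. \]
Hence, Proposition 4.4 yields
\[ \inf_{i \in [n]} D\Big(\frac{w_{I(w)\setminus\{i\}}}{\|w_{I(w)\setminus\{i\}}\|_2}\Big) \ge (C_{4.1}(K+R))^2p^{-1/2} \quad\text{for all } w \in W. \]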

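To estimate the second term in (5.2), decompose $W$ as $W = W_1 \cup W_2$, where
\[ W_1 := \Big\{w \in W \ \Big|\ \inf_{i \in [n]} D\Big(\frac{w_{I(w)\setminus\{i\}}}{\|w_{I(w)\setminus\{i\}}\|_2}\Big) \le \exp\big(c''_{4.1}pn/(K+R)^2\big)\Big\} \quad\text{and}\quad W_2 := W \setminus W_1. \]
Decompose $W_1$ further as
\[ W_1 = \bigcup^{\star}_{(C_{4.1}(K+R))^2p^{-1/2} \le L \le \exp(c''_{4.1}pn/(K+R)^2)} S_L, \tag{5.3} \]
where the $\star$ denotes that the union is taken over $L = 2^k$ for $k \in \mathbb{N}$. Then by Proposition 4.1,
\[ \mathbb{P}\big(\{\exists\, v \in W_1 \text{ such that } A_n^{D}v = 0\} \cap \Omega_K\big) \le \sum^{\star}_{(C_{4.1}(K+R))^2p^{-1/2} \le L \le \exp(c''_{4.1}pn/(K+R)^2)} \mathbb{P}\big(\{\exists\, v \in S_L \text{ such that } A_n^{D}v = 0\} \cap \Omega_K\big) \]
\[ \le \frac{2c''_{4.1}pn}{(K+R)^2}\exp(-\bar{c}_{4.1}n) \le \exp\Big(-\frac{\bar{c}_{4.1}}{2}n\Big). \]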
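
The union bound over dyadic levels costs only a factor that is linear in $pn$, which is why it is absorbed by the exponentially small probability for each level. The following sketch, with hypothetical function names and purely illustrative parameter values supplied by the caller, counts the dyadic scales $L = 2^k$ in a range given through its logarithms (so that the upper end $\exp(c''_{4.1}pn/(K+R)^2)$ never has to be formed explicitly) and returns the logarithm of the resulting union bound.

```python
import math


def count_dyadic_levels(log_lower, log_upper):
    """Number of integers k >= 0 with log_lower <= k * log(2) <= log_upper.

    This is the number of dyadic scales L = 2**k lying in the interval
    [exp(log_lower), exp(log_upper)].
    """
    k_min = max(0, math.ceil(log_lower / math.log(2)))
    k_max = math.floor(log_upper / math.log(2))
    return max(0, k_max - k_min + 1)


def log_union_bound(log_lower, log_upper, rate, n):
    """Natural log of (number of levels) * exp(-rate * n).

    Returns -inf when the range contains no dyadic level.
    """
    levels = count_dyadic_levels(log_lower, log_upper)
    if levels == 0:
        return float("-inf")
    return math.log(levels) - rate * n
```

Since the number of levels is at most `log_upper / log(2) + 1`, its logarithm grows like $\log(pn)$, which is negligible against `rate * n` for large $n$.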

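Thus, to finish the proof of Theorem 1.1, it is enough to estimate
\[ \mathbb{P}\big(\exists\, v \in W_2 \text{ such that } A_n^{D}v = 0 \text{ and } |\langle A_{n,1}, v\rangle| \le \varepsilon\rho\sqrt{p}\big). \]
Note that $v$ is defined by $A_{n,2}, \ldots, A_{n,n}$, so it is independent of $A_{n,1}$. Condition on $A_{n,2}, \ldots, A_{n,n}$ such that $v \in W_2$ for the matrix $A_n^{D}$ formed by these columns. We may now consider $v$ as a fixed vector satisfying
\[ \inf_{i \in [n]} D\Big(\frac{v_{I(v)\setminus\{i\}}}{\|v_{I(v)\setminus\{i\}}\|_2}\Big) \ge \exp\big(c''_{4.1}pn/(K+R)^2\big). \]
Let $A_{n,1}^{I(v)\setminus\{1\}}$ be the vector obtained from $A_{n,1}$ by keeping the coordinates corresponding to the set $I(v)\setminus\{1\}$. Since $v \notin \mathrm{Comp}(M, \rho)$, we have $\|v_{I(v)\setminus\{1\}}\|_2 \ge \rho/2$. Thus, using Proposition 4.2, we obtain that
\[ \mathbb{P}\big(|\langle A_{n,1}, v\rangle| \le \varepsilon\rho\sqrt{p}\big) \le \sup_{y \in \mathbb{R}} \mathbb{P}\Big(\big|\langle A_{n,1}^{I(v)\setminus\{1\}}, v_{I(v)\setminus\{1\}}\rangle - y\big| \le \varepsilon\rho\sqrt{p}\Big) \le 2C_{4.2}\Big(\varepsilon + \frac{1}{\sqrt{p}\,D(v_{I(v)\setminus\{1\}}/\|v_{I(v)\setminus\{1\}}\|_2)}\Big) \]
\[ \le 2C_{4.2}\Big(\varepsilon + \frac{1}{\sqrt{p}}\exp\big(-c''_{4.1}pn/(K+R)^2\big)\Big) \le 2C_{4.2}\Big(\varepsilon + \exp\Big(-\frac{c''_{4.1}pn}{2(K+R)^2}\Big)\Big), \]
where the last inequality follows from the assumption on $p$ (recall that $p = \Omega(\log n / n)$). Replacing $\varepsilon$ by $\varepsilon/(2C_{4.2})$, we obtain
\[ \mathbb{P}\big(|\langle A_{n,1}, v\rangle| \le (2C_{4.2})^{-1}\varepsilon\rho\sqrt{p}\big) \le \varepsilon + \exp\Big(-\frac{c''_{4.1}pn}{2(K+R)^2}\Big), \]
which completes the proof of the theorem. □

6. Estimates of the spectral norm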

In this section we prove bounds on the spectral norm of sparse random matrices with heavy-tailed entries (Theorem 1.4) and with sub-Gaussian entries (Theorem 1.7). Building on those theorems, we complete the proofs of Corollary 1.5 and Corollary 1.8. We then provide, in Remark 6.3, an outline of the proof of the spectral norm bound for sparse random matrices with entries satisfying (1.9).

To prove Theorem 1.4 we use the following result of Seginer [26].

Theorem 6.1 (Seginer). Let $A_n$ be a random matrix with i.i.d. centered entries whose columns are denoted by $A_{n,1}, \ldots, A_{n,n}$. Then there exists an absolute constant $C_{6.1}$ such that for $1 \le q \le 2\log n$,
\[ \mathbb{E}\|A_n\|^q \le C_{6.1}^q\, \mathbb{E}\max_{j \in [n]} \|A_{n,j}\|_2^q. \]

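Proof of Theorem 1.4. Fix $t \ge 1$. Then Markov's inequality yields
\[ \mathbb{P}\big(|a_{ij}| \ge t\sqrt{np}\big) \le \frac{\mathbb{E}|a_{ij}|^q}{(t\sqrt{np})^q} \le \Big(\frac{K}{t\sqrt{np}}\Big)^q \cdot p. \]
Hence, using the fact that $p = \Omega(n^{-\alpha})$, and using the definition of $q$, we have
\[ \mathbb{P}\big(\exists\, i,j \in [n] \text{ such that } |a_{ij}| \ge t\sqrt{np}\big) \le n^2 \cdot \Big(\frac{K}{t\sqrt{np}}\Big)^q p \le C'\Big(\frac{K}{t}\Big)^q, \tag{6.1} \]
for some constant $C'$, depending only on $\alpha$. Now setting $y_{ij} = a_{ij}\cdot\mathbf{1}(|a_{ij}| \le t\sqrt{np})$, we define the random variables
\[ z_{ij} := \Big(\frac{y_{ij}}{2t\sqrt{np}}\Big)^2 - \mathbb{E}\Big[\Big(\frac{y_{ij}}{2t\sqrt{np}}\Big)^2\Big]. \]
Upon observing $q \ge 4$, we note that
\[ \mathbb{E}z_{ij} = 0, \qquad \mathbb{E}z_{ij}^2 \le \Big(\frac{1}{2t\sqrt{np}}\Big)^4 Bp, \qquad\text{and}\qquad |z_{ij}| \le 1 \ \text{a.s.}, \]
for some constant $B$, depending only on the fourth moment of $\xi_{ij}$. Then by Bennett's inequality, for any $s \ge 1$,
\[ \mathbb{P}\Big(\sum_{i=1}^n \Big(\frac{y_{ij}}{2t\sqrt{np}}\Big)^2 - n\,\mathbb{E}\Big[\Big(\frac{y_{ij}}{2t\sqrt{np}}\Big)^2\Big] \ge s^2\Big) = \mathbb{P}\Big(\sum_{i=1}^n z_{ij} \ge s^2\Big) \le \exp\Big(-\frac{s^2}{8}\log\frac{s^2}{2n\mathbb{E}z_{ij}^2}\Big) \le \big(8B^{-1}s^2t^4np\big)^{-s^2/8}. \]
Recalling that $p = \Omega(n^{-\alpha})$, we obtain that for some positive constants $c$ and $C$, depending on $\alpha$ and the fourth moment of $\{\xi_{ij}\}$,
\[ \mathbb{P}\Big(\exists\, j \in [n] \text{ such that } \sum_{i=1}^n y_{ij}^2 \ge 8s^2t^2np\Big) \le n\big(8B^{-1}s^2t^4np\big)^{-s^2/8} \le n^{-(1-\alpha)s^2/8+1} \le n^{-cs^2} \]
for all $s \ge C$. Therefore,
\[ \mathbb{P}\Big(\exists\, j \in [n] \text{ such that } \sum_{i=1}^n a_{ij}^2 \ge 8s^2t^2np\Big) \le n^{-cs^2} + \mathbb{P}\big(\exists\, i,j \in [n] \text{ such that } a_{ij} \ne y_{ij}\big) \le n^{-cs^2} + C'\Big(\frac{K}{t}\Big)^q. \]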


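Let $r < q$, and choose $\beta > 0$ so that $\frac{q}{1+\beta} > r$. Setting $s = \tau^{\beta/(1+\beta)}$ and $t = \tau^{1/(1+\beta)}$, we conclude that for all sufficiently large $\tau$,
\[ \mathbb{P}\big(\exists\, j \in [n] \text{ such that } \|A_{n,j}\|_2 \ge 2\sqrt{2}\,\tau\sqrt{np}\big) \le n^{-c\tau^{2\beta/(1+\beta)}} + C'K^q\tau^{-q/(1+\beta)}. \]
Upon using integration by parts, from the last inequality we deduce $\mathbb{E}\max_{j \in [n]}\|A_{n,j}\|_2^r \le C(\sqrt{np})^r$, for some $C$ depending on $\alpha$, $K$, $r$, and the fourth moment of $\{\xi_{ij}\}$, which in combination with Seginer's theorem proves part (i).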
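
The arithmetic behind (6.1) can be made explicit in the boundary case $p = n^{-\alpha}$: the quantity $n^2 \cdot p \cdot (\sqrt{np})^{-q}$ is then a pure power of $n$, and the union bound is useful exactly when that power is non-positive. The sketch below only records this computation; the function names are hypothetical, and the precise condition on $q$ is the one in the statement of Theorem 1.4, which lies outside this section.

```python
def union_bound_exponent(q, alpha):
    """Exponent e with n**2 * p * (n*p)**(-q/2) == n**e when p == n**(-alpha)."""
    return (2.0 - alpha) - 0.5 * q * (1.0 - alpha)


def critical_moment(alpha):
    """Smallest q for which union_bound_exponent(q, alpha) <= 0 (needs alpha < 1)."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    return 2.0 * (2.0 - alpha) / (1.0 - alpha)
```

In the dense case $\alpha = 0$ the threshold returned by `critical_moment` is $4$, in line with the observation $q \ge 4$ used in the proof above.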


To prove (ii), take $\varrho \in (r, q)$, and let $\xi$ be a symmetric random variable such that $\mathbb{P}(|\xi| \ge t) \sim t^{-\varrho}$ as $t \to \infty$. Let $\xi_{ij}$, $i,j \in [n]$, be independent copies of $\xi$.

By Chernoff's inequality, for any $c \in (0,1)$ there exist positive constants $c', c''$ such that
\[ \mathbb{P}\Big(\sum_{i,j=1}^n \delta_{ij} \le cn^2p\Big) \le \exp(-c'n^2p) \le \exp(-c''n^{2-\alpha}). \]
Now conditioning on the event $E$ that $\sum_{i,j=1}^n \delta_{ij} \ge cn^2p$, we see that for a sufficiently large $t$,
\[ \mathbb{P}\Big(\max_{i,j \in [n]}|a_{ij}| \le t\sqrt{np} \ \Big|\ E\Big) \le \big(1 - C(t\sqrt{np})^{-\varrho}\big)^{cpn^2} \le \exp\big(-C(t\sqrt{np})^{-\varrho}cpn^2\big) \le \exp\big(-C't^{-\varrho}n^{-\varrho(1-\alpha)/2+2-\alpha}\big), \]
where $C' = Cc$. Removing the conditioning, we obtain
\[ \mathbb{P}\big(\|A_n\| \le t\sqrt{np}\big) \le \mathbb{P}\Big(\max_{i,j \in [n]}|a_{ij}| \le t\sqrt{np}\Big) \le \exp\big(-C't^{-\varrho}n^{-\varrho(1-\alpha)/2+2-\alpha}\big) + \exp(-c''n^{2-\alpha}). \]
Setting $t = n^{\nu}$ for some $\nu > 0$ such that
\[ \varrho\Big(\nu + \frac{1-\alpha}{2}\Big) < 2 - \alpha \]
completes the proof. □

Since in Theorem 1.1 we consider a matrix with i.i.d. off-diagonal entries and zero diagonal entries, we cannot directly apply Theorem 1.4 to prove Corollary 1.5. If we are able to show that $\|A_n\| = O(\sqrt{np})$ with large probability, then conditioning on this event we can complete the proof of Corollary 1.5. To prove such a bound with large probability, we note that the operator norm of any diagonal matrix is the maximum of the absolute values of its entries. Thus the proof is completed using Markov's inequality and the union bound (for example, one can proceed as in (6.1)).

Proof of Theorem 1.7. Let $\xi'_{ij}$, $i,j \in [n]$, be independent copies of $\xi_{ij}$, $i,j \in [n]$, and let $A'_n$ and $B_n$ be the matrices with entries $a'_{ij} = \delta_{ij}\xi'_{ij}$ and $b_{ij} = \delta_{ij}\eta_{ij}$, respectively, where $\eta_{ij} := \xi_{ij} - \xi'_{ij}$.

Further let us denote by $\mathbb{E}_\xi$ the expectation with respect to $\xi$, conditioned on $\delta := (\delta_{ij})_{i,j \in [n]}$. Let $q \ge 1$ be an even integer. By Jensen's inequality,
\[ \mathbb{E}_\xi\|A_n\|^q = \mathbb{E}_\xi\|A_n - \mathbb{E}_{\xi'}A'_n\|^q \le \mathbb{E}_\eta\|B_n\|^q. \tag{6.2} \]
Let $g_{ij}$, $i,j \in [n]$, be independent $N(0,1)$ random variables. Since $\eta_{ij}$ is a sub-Gaussian random variable, there exists a constant $C_1 > 0$, depending on the sub-Gaussian norm of $\{\xi_{ij}\}$, such that $\mathbb{E}|\eta_{ij}|^q \le \mathbb{E}|C_1g_{ij}|^q$ for all $q \ge 1$. Let $W_n$ be the $n \times n$ random matrix with entries $w_{ij} = \delta_{ij}g_{ij}$. Since
\[ \mathbb{E}_\eta\|B_n\|^q \le \mathbb{E}_\eta\mathrm{Tr}\big((B_nB_n^*)^{q/2}\big), \tag{6.3} \]
where the last expression is a polynomial of the even moments of $\eta_{ij}$ with non-negative coefficients, we have that
\[ \mathbb{E}_\eta\mathrm{Tr}\big((B_nB_n^*)^{q/2}\big) \le C_1^q\,\mathbb{E}_g\mathrm{Tr}\big((W_nW_n^*)^{q/2}\big) \le C_1^q \cdot n\,\mathbb{E}_g\|W_n\|^q. \tag{6.4} \]
The last inequality above uses that $\mathrm{Tr}\big((W_nW_n^*)^{q/2}\big) = \sum_{j=1}^n \lambda_j^{q/2}(W_nW_n^*)$, and all eigenvalues $\lambda_j(W_nW_n^*)$ satisfy $|\lambda_j(W_nW_n^*)| \le \|W_n\|^2$.
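
The two-sided comparison $\|W\|^q \le \mathrm{Tr}\big((WW^*)^{q/2}\big) \le n\|W\|^q$ behind (6.3) and (6.4) can be checked numerically on small real matrices. The following is a minimal pure-Python sketch under stated assumptions: all function names are hypothetical, the spectral norm is only estimated by power iteration (which approaches the true norm from below), and `sparse_gaussian` draws one sample of a matrix with entries $\delta_{ij}g_{ij}$ for illustration.

```python
import math
import random


def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def trace_power(w, q):
    """Tr((W W^T)^(q/2)) for an even integer q >= 2."""
    if q < 2 or q % 2:
        raise ValueError("q must be an even integer >= 2")
    gram = mat_mul(w, transpose(w))
    power = gram
    for _ in range(q // 2 - 1):
        power = mat_mul(power, gram)
    return sum(power[i][i] for i in range(len(power)))


def spectral_norm(w, iterations=500):
    """Power-iteration estimate of ||W|| (a lower bound on the true norm)."""
    wt = transpose(w)
    v = [1.0] * len(w[0])
    for _ in range(iterations):
        u = [sum(x * y for x, y in zip(row, v)) for row in w]    # W v
        z = [sum(x * y for x, y in zip(row, u)) for row in wt]   # W^T W v
        length = math.sqrt(sum(x * x for x in z))
        if length == 0.0:
            return 0.0
        v = [x / length for x in z]
    return math.sqrt(sum(sum(x * y for x, y in zip(row, v)) ** 2 for row in w))


def sparse_gaussian(n, p, seed=0):
    """One sample of an n x n matrix with entries delta_ij * g_ij."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) if rng.random() < p else 0.0 for _ in range(n)]
            for _ in range(n)]
```

For a sample `g = sparse_gaussian(n, p)`, comparing `spectral_norm(g) ** q`, `trace_power(g, q)` and `n * spectral_norm(g) ** q` illustrates why passing to the trace loses only a factor $n$, which is later absorbed by taking $q$ of order $pn$.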

To estimate $\mathbb{E}\|W_n\|^q$, we use the following recent result of Bandeira and van Handel [2].

Lemma 6.2 ([2, Theorem 3.1]). Let $X$ be an $n \times m$ rectangular matrix with $X_{ij} = b_{ij}g_{ij}$, where $g_{ij}$ are i.i.d. $N(0,1)$. Let
\[ \sigma_1 := \max_i \sqrt{\sum_j b_{ij}^2}, \qquad \sigma_2 := \max_j \sqrt{\sum_i b_{ij}^2}, \qquad \sigma_* := \max_{i,j}|b_{ij}|. \]


Then 


5 
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First, let us denote 0 to be the event that for all i € [n], 

\[ \sum_{j=1}^n \delta_{ij} \le Cpn \quad\text{and}\quad \sum_{j=1}^n \delta_{ji} \le Cpn, \]
for some $C \ge 2$. Since $p \ge C_0\frac{\log n}{n}$, by Chernoff's inequality and the union bound, we see that we can choose the constant $C_0$ large enough such that $\mathbb{P}(\Omega^c) \le e^{-cpn}$, for some $c > 0$ depending only on $C_0$. Assuming that $\delta \in \Omega$ and conditioning on $\delta$, using Lemma 6.2, we have
\[ \mathbb{E}\big[\|W_n\| \mid \delta\big] \le 2\big(\sqrt{Cpn} + C_*\sqrt{\log n}\big) \le \sqrt{C'pn}, \]
where $C_*$ is some absolute constant, and $C' = 4(C_*)^2C$. Conditionally on $\delta$, $\|W_n\|$ can be viewed as a 1-Lipschitz function on $\mathbb{R}^{n^2}$ equipped with the standard Gaussian measure. Using the Gaussian concentration inequality (for example, see [14]), we obtain
\[ \mathbb{P}\big[\|W_n\| \ge \mathbb{E}[\|W_n\| \mid \delta] + t \mid \delta\big] \le C\exp(-c't^2) \]
for some absolute constants $C, c' > 0$, and any $t > 0$. Hence,
\[ \mathbb{E}_g\|W_n\|^q \le (C'pn)^{q/2} + \int_{\sqrt{C'pn}}^{\infty} qs^{q-1}\,\mathbb{P}\big[\|W_n\| \ge s \mid \delta\big]\,ds \le (C'pn)^{q/2} + (C''q)^{q/2}, \]
for some absolute constant $C''$. Setting $q = pn$ (or taking the closest even number), we see that this inequality in combination with (6.2), (6.3), and (6.4) yields
\[ \mathbb{E}_\xi\|A_n\|^{pn} \le n \cdot (C_2pn)^{pn/2} \le (C_2^2pn)^{pn/2}, \]
where we used the condition $p \ge C_0\frac{\log n}{n}$ to absorb the factor $n$, and $C_2$ is a positive constant depending on $C_0$ and the sub-Gaussian norm of $\{\xi_{ij}\}$. Now if we choose $C_{1.7} > C_2$, then Markov's inequality implies that for any $\delta \in \Omega$, there exists a small positive constant $c_{1.7}$, depending on $C_{1.7}$, such that
\[ \mathbb{P}\big[\|A_n\| \ge C_{1.7}\sqrt{pn} \mid \delta\big] \le \exp(-c_{1.7}pn). \]
Finally, shrinking $c_{1.7}$ further, we have
\[ \mathbb{P}\big(\|A_n\| \ge C_{1.7}\sqrt{pn}\big) \le \max_{\delta \in \Omega}\mathbb{P}\big[\|A_n\| \ge C_{1.7}\sqrt{pn} \mid \delta\big] + \mathbb{P}(\Omega^c) \le \exp(-c_{1.7}pn). \]
This completes the proof. □


We have already seen that we cannot apply Theorem 1.1 and Theorem 1.7 directly to prove Corollary 1.8. We also need to show that $\|A_n\| = O(\sqrt{np})$ with large probability. This can be done very easily by repeating the steps in the proof of Theorem 1.7. We omit the details. Then, proceeding as in the proof of Corollary 1.5, we complete the proof of Corollary 1.8.


Remark 6.3. One can extend the results of Theorem 1.7 to random variables satisfying (1.9). To this end, we will use the following result of Latała [12].

Lemma 6.4. Fix $q \ge 1$, and let $\{\zeta_i\}_{i=1}^n$ be i.i.d. copies of a non-negative random variable $\zeta$. Then

l/s 


n 

£ 

i=1 


Ci 


sup 



: max ( 1 


n 


< s < q 












Fixing q = logn, for each j E [n], we apply the above result for Q = Thus denoting A n j 


to be the j-th column of A n we have 


1 "j'll2l \2q 




2—1 


< sup < Clog n 


np 

logn 


l/s 


s 2/i 1 : 1 < s < log 


n 


for some absolute constant C. Analyzing /(s) := 7 log + (2/3 — l)logs, for s E [l,logn], we 

note that 


ln jll2ll2 q 


< C log n max ■ 


np 


logn’ \logn 


np \ lo § 


(logn ) 2/5 1 > < eCnp, 


if np = ^((logn) 2 ^). Thus applying Seginer’s theorem now for q = 2 log n we get that 

(log n) 2/3 ' 


E || ||^ = nO(y/np) q , when p = Q 


n 


Finally, applying Markov's inequality, we get that for every $s > 0$ there exists $K := K(s)$ such that
\[ \mathbb{P}\big(\|A_n\| \ge K\sqrt{np}\big) \le n^{-s}. \]


7. Proof of Theorem 1.11 


In this section we prove Theorem 1.11. Since the entries of the adjacency matrix of an Erdős–Rényi graph have non-zero mean, we first extend Theorem 1.1 to allow non-centered random variables.


Theorem 7.1. Let $A_n$ be an $n \times n$ matrix with zero diagonal and i.i.d. off-diagonal entries $a_{i,j} = \delta_{i,j}\xi_{i,j}$, where $\delta_{i,j}$, $i,j \in [n]$, $i \ne j$, are independent Bernoulli random variables taking value 1 with probability $p_n \in (0,1]$, and $\xi_{i,j}$, $i,j \in [n]$, $i \ne j$, are i.i.d. random variables with unit variance and finite fourth moment. Fix $K \ge 1$, and let $\Omega_K := \{\|A_n - \mathbb{E}A_n\| \le K\sqrt{np_n}\}$. Further fix $R \ge 1$ and let $D_n$ be a real diagonal matrix with $\|D_n\| \le R\sqrt{np_n}$. Then there exist constants $0 < c_{7.1}, c'_{7.1}, C_{7.1}, C'_{7.1} < \infty$, depending on $K$, $R$, and the fourth moment of $\xi_{i,j}$, such that for any $\varepsilon > 0$, and
\[ p_n \ge \frac{C_{7.1}\log n}{n}, \]
we have
\[ \mathbb{P}\Big(\Big\{s_{\min}(A_n + D_n) \le C'_{7.1}\,\varepsilon\exp\Big(-c_{7.1}\frac{\log(1/p_n)}{\log(np_n)}\Big)\sqrt{\frac{p_n}{n}}\Big\} \cap \Omega_K\Big) \le \varepsilon + \exp(-c'_{7.1}np_n). \tag{7.1} \]

The key ingredients in the proof of Theorem 1.1 are Proposition 3.1 and Proposition 4.1. Thus to 
prove Theorem 7.1, we need analogues of Proposition 3.1 and Proposition 4.1 for the non-centered 
case. To this end, we start with the following generalizations of those two results. Before stating 
these results, for ease of writing, let us denote $\mu := \mathbb{E}\xi_{i,j}$, and let $U_n^m$ be the $m \times n$ matrix 
of all ones. Note in passing that $|\mu|$ is bounded in terms of the fourth moment of $\xi_{i,j}$. Now we are 
ready to state the results. 

Proposition 7.2. Let $A_n$, $D_n$, $K$, and $p$ be as in Theorem 7.1. Define $A_n^{D,m}$, $A_n^{m}$, and $S_L$ as 
in Proposition 4.1. Fix a vector $y \in \mathbb{R}^m$, and $r > 1$. Then there exist small positive constants 
$c^*_{7.2}$, $c'_{7.2}$, $c''_{7.2}$ and a large positive constant $C_{7.2}$, depending on the fourth moment of $\xi_{i,j}$, $K$ and 
$R$, and small positive constants $c_{7.2}$, $\bar{c}_{7.2}$, $\tilde{c}_{7.2}$, depending on the fourth moment of $\xi_{i,j}$, $K$, $R$, 
and also on $r$, such that, if $r \ge (C_{7.2}(K + R))^2$, then for $r p^{-1/2} \le L \le \exp(c^*_{7.2}\, pn/(K + R)^2)$, 
$m \ge n - \tilde{c}_{7.2} (K + R)^{-2}/p$, we have 

$$\mathbb{P}\Big( \inf_{v \in S_L} \big\| (A_n^{D,m} - \mu p U_n^m) v - y \big\|_2 \le c_{7.2}\, \rho\, \varepsilon_0 \sqrt{pn} \ \text{ and } \ \big\| A_n^{D,m} - \mathbb{E} A_n^{D,m} \big\| \le K\sqrt{pn} \Big) \le \exp(-\bar{c}_{7.2}\, n),$$

where 

$$\varepsilon_0 = \min\big( c''_{7.2}/\sqrt{r}, \ c'_{7.2} \sqrt{n}/L \big).$$


Proof. Recall that the proof of Proposition 4.1 is based on an estimate on the Lévy concentration 
function, followed by a special $\varepsilon$-net argument. The required estimate on the Lévy concentration 
function follows from Proposition 4.3, the key to which is Proposition 4.2. Since Proposition 4.2 
does not require the zero mean condition, it continues to hold in this set-up, and therefore so does 
Proposition 4.3. Furthermore, we note that the Lévy concentration function is not affected by 
translation by a fixed vector. Therefore, Eqn. (4.2) can be strengthened to the following: 

$$\mathcal{L}\Big( (A_n^{D,m} - \mu p U_n^m) v - y, \ \varepsilon \sqrt{pm}\, \inf_{j \in [n]} \| v_{I \setminus \{j\}} \|_2 \Big) \le C_{4.3}^m \Big[ \varepsilon + \frac{1}{\sqrt{p}\, \inf_{j \in [n]} D\big( v_{I \setminus \{j\}} / \| v_{I \setminus \{j\}} \|_2 \big)} \Big]^m$$

for any $I \subset [n]$. The remaining part of the proof of Proposition 4.1 uses an $\varepsilon$-net argument. To 
carry out the same argument here, we need a bound on the operator norm of the matrix under 
consideration, i.e. we need a bound on $\|A_n^{D,m} - \mu p U_n^m\|$. However, noting that $\mu p U_n - \mathbb{E}A_n = \mu p I_n$, 
the required bound is immediate on the event $\|A_n^{D,m} - \mathbb{E}A_n^{D,m}\| \le K\sqrt{pn}$. The rest of the argument 
remains exactly the same, and hence we omit the details. □ 


Now we turn to find an analogue of Proposition 3.1 in the non-centered case. Recall that the 
proof of Proposition 3.1 can be split into two major parts. In the first part we control the infimum 
over very sparse vectors (and vectors close to those sparse vectors) by showing that there are large 
blocks inside $A_n$ which have only one non-zero element per row, and in the second part, where we 
focus on moderately sparse vectors, the proof is carried out by obtaining necessary estimates on the 
Lévy concentration function and an $\varepsilon$-net argument. To extend Proposition 3.1 to the non-centered 
set-up, one would like to extend this scheme to $A_n - \mathbb{E}A_n$. The first part of the proof of Proposition 
3.1, in particular Lemma 3.2, crucially uses the fact that the entries of $A_n$ are of the form $\xi_{i,j}\delta_{i,j}$, 
where $\delta_{i,j} \sim \mathrm{Ber}(p)$, and $\xi_{i,j}$ are centered random variables with unit variance. However, the 
entries of $A_n - \mathbb{E}A_n$ do not have this required product structure. So, one cannot directly extend 
Proposition 3.1 to this case. 

We overcome this obstacle by using a “folding” trick. More elaborately, given any $n \times n$ 
matrix $A_n$, we construct two $\lfloor n/2 \rfloor \times n$ matrices, denoted hereafter by $A_n^{(1)}$ and $A_n^{(2)}$, consisting of 
the first $\lfloor n/2 \rfloor$ and the next $\lfloor n/2 \rfloor$ rows of the matrix $A_n$, respectively. Further, denote 
$\bar{A}_n := A_n^{(1)} - A_n^{(2)}$. Using the triangle inequality, one can note that $\|A_n x\|_2^2 \ge \frac{1}{2}\|\bar{A}_n x\|_2^2$. Therefore, it 
is enough to control the infimum of $\|\bar{A}_n x\|_2$. As we will show below, the advantage of working 
with $\bar{A}_n$ is that its entries have the required product structure. Therefore, one can hope to use the 
ingredients of the proof of Proposition 3.1 to obtain the necessary lower bound on the infimum. 
However, we should note that the number of rows of the matrix under consideration is reduced by 
one half from the centered case, which worsens the probability bounds. Nevertheless, we can carry 
out the above approach for treating sparse vectors as well as the vectors close to sparse, since the 
sizes of the nets for such sets depend on the size of the support, which is much smaller than $n$. 
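
The folding construction is easy to test numerically. The snippet below is a minimal sketch, not part of the argument: the helper names (`fold`, `matvec`, `norm2`, `sparse_matrix`) and the Gaussian choice for the non-centered entries are assumptions made only for illustration. It builds the folded matrix from the first and next $\lfloor n/2 \rfloor$ rows and checks that $\sqrt{2}\,\|A_n x\|_2$ dominates the norm of the folded matrix applied to $x$.

```python
import math
import random


def fold(a):
    """Folded matrix: first floor(n/2) rows minus the next floor(n/2) rows."""
    h = len(a) // 2
    return [[a[i][j] - a[i + h][j] for j in range(len(a[0]))] for i in range(h)]


def matvec(a, x):
    """Plain matrix-vector product for a list-of-lists matrix."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in a]


def norm2(v):
    """Euclidean norm."""
    return math.sqrt(sum(t * t for t in v))


def sparse_matrix(n, p, rng):
    """Entries delta * xi with delta ~ Ber(p); xi ~ N(1, 1) is an illustrative
    non-centered choice, not the distribution assumed in the text."""
    return [[rng.gauss(1.0, 1.0) if rng.random() < p else 0.0 for _ in range(n)]
            for _ in range(n)]


rng = random.Random(0)
n, p = 9, 0.4
a = sparse_matrix(n, p, rng)
x = [rng.gauss(0.0, 1.0) for _ in range(n)]

lhs = math.sqrt(2) * norm2(matvec(a, x))
rhs = norm2(matvec(fold(a), x))
folding_holds = rhs <= lhs + 1e-12
```

For odd $n$ the last row is simply dropped by the folding, which only makes the inequality easier to satisfy.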

Before formally stating the result, let us introduce one more piece of notation: for an $n \times n$ diagonal 
matrix $D_n$, define $D_n^{(1)}$ and $D_n^{(2)}$ to be the matrices consisting of the first $\lfloor n/2 \rfloor$ and the next $\lfloor n/2 \rfloor$ rows 
of $D_n$. Further, denote $\bar{D}_n := D_n^{(1)} - D_n^{(2)}$. Now we are ready to state the result for compressible 
and dominated vectors. 


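Proposition 7.3. Let $A_n$, $D_n$, $K$, and $p$ be as in Theorem 7.1, and $\ell_0$ be as in Proposition 3.1. 
Then there exist constants $0 < c_{7.3}, \bar{c}_{7.3}, C_{7.3}, \bar{C}_{7.3}, \tilde{C}_{7.3} < \infty$, depending only on $K$, $R$, and the 
fourth moment of $\xi_{i,j}$, such that for any $p^{-1} \le M \le c_{7.3}\, n$, 

$$\mathbb{P}\Big( \exists x \in \mathrm{Dom}\big(M, (C_{7.3}(K + R))^{-4}\big) \cup \mathrm{Comp}(M, \rho) : \ \big\| (\bar{A}_n + \bar{D}_n) x \big\|_2 \le \bar{C}_{7.3} (K + R) \rho \sqrt{np} \ \text{ and } \ \|\bar{A}_n\| \le K\sqrt{pn} \Big) \le \exp(-\bar{c}_{7.3}\, pn),$$

where $\rho = (\tilde{C}_{7.3}(K + R))^{-\ell_0 - 6}$. 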


Proof. We proceed as in the proof of Proposition 3.1. As there, we first 
need to control the infimum over vectors close to $1/(8p)$-sparse vectors. That is, we need to show that 
there exist constants $0 < c, C, \bar{C} < \infty$, depending on the fourth moment of $\xi_{i,j}$, $K$, and $R$, 
such that 

$$\mathbb{P}\Big( \exists x \in \mathrm{Dom}\big((8p)^{-1}, (C(K + R))^{-1}\big) \text{ such that } \big\| (\bar{A}_n + \bar{D}_n) x \big\|_2 \le (\bar{C}(K + R))^{-\ell_0} \sqrt{np} \ \text{ and } \ \|\bar{A}_n\| \le K\sqrt{pn} \Big) \le \exp(-cpn). \tag{7.2}$$

The analogue of (7.2) in the proof of Proposition 3.1 crucially uses Lemma 3.2. We therefore need 
to find a version of Lemma 3.2 applicable to $\bar{A}_n$. To this end, we show that the entries of $\bar{A}_n$ have 
the required product structure. 

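Define i.i.d. random variables $\theta_{i,j} \in \{1,2,3\}$, independent of $A_n$, such that 

$$\mathbb{P}(\theta_{i,j} = 1) = \frac{1-p}{2-p}, \qquad \mathbb{P}(\theta_{i,j} = 2) = \frac{1-p}{2-p}, \qquad \text{and} \qquad \mathbb{P}(\theta_{i,j} = 3) = \frac{p}{2-p}.$$

Set 

$$\bar{\xi}_{i,j} := \xi_{i,j} \cdot \mathbf{1}(\theta_{i,j} = 1) - \xi_{i + \lfloor n/2 \rfloor, j} \cdot \mathbf{1}(\theta_{i,j} = 2) + \big( \xi_{i,j} - \xi_{i + \lfloor n/2 \rfloor, j} \big) \cdot \mathbf{1}(\theta_{i,j} = 3).$$

Let $\bar{\delta}_{i,j}$ be another family of i.i.d. Bernoulli random variables, independent of $A_n$, taking value 1 
with probability $p(2 - p)$. Then the random variable $\bar{a}_{i,j} = \delta_{i,j}\xi_{i,j} - \delta_{i + \lfloor n/2 \rfloor, j}\, \xi_{i + \lfloor n/2 \rfloor, j}$ has the same 
distribution as $\bar{\delta}_{i,j}\bar{\xi}_{i,j}$, and these entries are independent for all $i,j$. This is the desired product 
structure, and therefore we can proceed as in the proof of Lemma 3.2. 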
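
The distributional identity behind this product structure can be verified exactly for any finitely supported entry law. The sketch below uses exact rational arithmetic; the function names and the particular three-point law chosen for the entries are hypothetical and serve only as an example. It compares the law of the difference of two independent sparse entries with the law of a Bernoulli($p(2-p)$) variable times the three-way mixture described above. It checks only the one-dimensional marginal, not independence across entries.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def law_of_difference(p, xi_law):
    """Exact law of delta*xi - delta'*xi' with independent
    delta, delta' ~ Ber(p) and xi, xi' distributed as xi_law."""
    law = defaultdict(Fraction)
    for d1, d2 in product((0, 1), repeat=2):
        pd = (p if d1 else 1 - p) * (p if d2 else 1 - p)
        for (v1, q1), (v2, q2) in product(xi_law.items(), repeat=2):
            law[d1 * v1 - d2 * v2] += pd * q1 * q2
    return dict(law)


def law_of_product(p, xi_law):
    """Exact law of bar_delta * bar_xi, where bar_delta ~ Ber(p(2-p)) and
    bar_xi is the mixture selected by theta in {1, 2, 3}."""
    w12 = (1 - p) / (2 - p)  # P(theta = 1) = P(theta = 2)
    w3 = p / (2 - p)         # P(theta = 3)
    bar_xi = defaultdict(Fraction)
    for v, q in xi_law.items():
        bar_xi[v] += w12 * q   # theta = 1: bar_xi = xi
        bar_xi[-v] += w12 * q  # theta = 2: bar_xi = -xi'
    for (v1, q1), (v2, q2) in product(xi_law.items(), repeat=2):
        bar_xi[v1 - v2] += w3 * q1 * q2  # theta = 3: bar_xi = xi - xi'
    q = p * (2 - p)
    law = defaultdict(Fraction)
    law[0] += 1 - q
    for v, w in bar_xi.items():
        law[v] += q * w
    return dict(law)


# Illustrative parameters: a non-centered three-point law for xi.
p = Fraction(1, 5)
xi_law = {1: Fraction(1, 2), 3: Fraction(1, 3), -2: Fraction(1, 6)}
same_law = law_of_difference(p, xi_law) == law_of_product(p, xi_law)
```

The check rests on the identities $p(2-p)\cdot\frac{1-p}{2-p} = p(1-p)$ and $p(2-p)\cdot\frac{p}{2-p} = p^2$, which match the probabilities that exactly one, respectively both, of the two Bernoulli selectors equal one.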

More elaborately, recall that the key to the proof of Lemma 3.2 is a pair of bounds on the 
probabilities that a given index $i$ belongs to the index sets defined there (for example, see (3.1) 
and (3.3)). We have the same inequalities here, 
using the product structure shown above. Now applying Chernoff's inequality, and proceeding 
in the same way as there, we obtain an analogue of Lemma 3.2 for $\bar{A}_n$. The only difference from Lemma 
3.2 is that the constants appearing there get reduced by one half, as we now have a matrix with 
$\lfloor n/2 \rfloor$ rows, instead of $n$ rows. 

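Equipped with this analogue of Lemma 3.2 we then proceed as in the proof of Lemma 3.3. 
Considering the case $p \ge (1/4) n^{-1/3}$, similar to (3.7) we obtain 

$$\big\| (\bar{A}_n + \bar{D}_n) x \big\|_2^2 \ge \sum_{k \in \mathrm{supp}(x)} \sum_{i \in I_k} \big( ((\bar{A}_n + \bar{D}_n) x)_i \big)^2.$$

To get rid of $\bar{D}_n$ from the above expression, we lower bound the sum over $i \in I_k$ by a sum over 
$i \in I_k \setminus \overline{\mathrm{supp}}(x)$, where $\overline{\mathrm{supp}}(x) := \{ j \in [\lfloor n/2 \rfloor] : x_j \ne 0 \text{ or } x_{j + \lfloor n/2 \rfloor} \ne 0 \}$. Since $|\overline{\mathrm{supp}}(x)| \ll |I_k|$, 
we can proceed as in (3.8), and obtain that 

$$\big\| (\bar{A}_n + \bar{D}_n) x \big\|_2^2 \ge \sum_{k \in \mathrm{supp}(x)} \sum_{i \in I_k \setminus \overline{\mathrm{supp}}(x)} \big( (\bar{A}_n x)_i \big)^2 \ge \sum_{k \in \mathrm{supp}(x)} \frac{c_{3.2}\, pn}{2}\, x_k^2.$$

Next, repeating the same steps as in Lemma 3.2, we establish (7.2). The proof of (7.2) in the 
remaining range, where $p < (1/4) n^{-1/3}$ and $p$ satisfies the lower bound assumed in Theorem 7.1, 
requires a similar adaptation. Details are omitted. 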

We then need to extend (7.2) to $\mathrm{Comp}((8p)^{-1}, \rho)$ vectors, and this can be done by repeating the 
proof of Lemma 3.4. Finally one needs to extend (7.2) to $\mathrm{Dom}(M, (C(K + R))^{-4})$ vectors, where 
$p^{-1} \le M \le cn$, and $c$, $C$ are some positive constants. For $A_n$ this was done in Lemma 3.8 
using the Lévy concentration function, an $\varepsilon$-net argument, and the union bound. The estimate on the 
Lévy concentration function in Corollary 3.7 was derived from Lemma 3.5. Note that Lemma 3.5 
continues to hold for $\bar{A}_n$. This implies we also obtain Corollary 3.7 for $\bar{A}_n$, except that the constants 
appearing there are decreased by one half, as $\bar{A}_n$ has only $\lfloor n/2 \rfloor$ rows. Shrinking $c_{7.3}$, as needed, 
we also argue that the $\varepsilon$-net here is not too big. Therefore, one can carry out the same steps as in 
Lemma 3.8 to complete the proof. We omit the details. □ 


Next we combine Proposition 7.3 and Proposition 7.2 to prove Theorem 7.1. 

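Proof of Theorem 7.1. As noted in the proof of Theorem 1.1, for any $\vartheta > 0$, 

$$\mathbb{P}\big( \{ s_{\min}(A_n + D_n) \le \vartheta \} \cap \Omega_K \big) \le \mathbb{P}\Big( \Big\{ \inf_{x \in V^c} \|(A_n + D_n) x\|_2 \le \vartheta \Big\} \cap \Omega_K \Big) + \mathbb{P}\Big( \Big\{ \inf_{x \in V} \|(A_n + D_n) x\|_2 \le \vartheta \Big\} \cap \Omega_K \Big),$$

where $\Omega_K := \{ \|A_n - \mathbb{E}A_n\| \le K\sqrt{np} \}$, 

$$V := S^{n-1} \setminus \big( \mathrm{Comp}(c_{7.3}\, n, \rho) \cup \mathrm{Dom}(c_{7.3}\, n, (C_{7.3}(K + R))^{-4}) \big),$$

and $\rho$ is as in Proposition 7.3. Now note that, using the triangle inequality, we have that $\|\bar{A}_n\| \le 3K\sqrt{np}$ 
on the event $\Omega_K$. Next we observe that $\|(A_n + D_n) x\|_2^2 \ge \|(A_n^{(1)} + D_n^{(1)}) x\|_2^2 + \|(A_n^{(2)} + D_n^{(2)}) x\|_2^2$ 
for any $x \in \mathbb{R}^n$, and therefore we deduce that $\sqrt{2}\, \|(A_n + D_n) x\|_2 \ge \|(\bar{A}_n + \bar{D}_n) x\|_2$. Thus applying 
Proposition 7.3 we obtain that 

$$\mathbb{P}\Big( \inf_{x \in V^c} \|(A_n + D_n) x\|_2 \le \bar{C}_{7.3} (K + R) \rho \sqrt{np} \ \text{ and } \ \|A_n - \mathbb{E}A_n\| \le K\sqrt{pn} \Big) \le \exp(-\bar{c}_{7.3}\, pn).$$

It therefore remains to bound 

$$\mathbb{P}\Big( \Big\{ \inf_{x \in V} \|(A_n + D_n) x\|_2 \le \vartheta \Big\} \cap \Omega_K \Big).$$

To this end, proceeding as in the proof of Theorem 1.1 we note that we need to bound 

$$p_1 := \mathbb{P}\big( \{ \exists v \in W^c \text{ such that } A_n^{D} v = 0 \} \cap \Omega_K \big)$$

and 

$$p_2 := \mathbb{P}\big( \{ \exists v \in W \text{ such that } A_n^{D} v = 0 \text{ and } |\langle A_{n,1}, v \rangle| \le \rho \varepsilon \sqrt{p} \} \cap \Omega_K \big),$$

where $Q = (C_{7.2}(K + R))^{12} p^{-1}$, and 

$$W = S^{n-1} \setminus \big( \mathrm{Comp}(Q, \rho) \cup \mathrm{Dom}(Q, (C_{3.1}(K + R))^{-4}) \big).$$

To bound $p_1$ we again apply the same folding argument to the matrix $A_n^{D}$. That is, we define the 
matrix $\bar{A}_n^{m}$ from the matrix $A_n^{m}$, and then apply Proposition 7.3. To bound $p_2$, as in the proof of 
Theorem 1.1, we decompose $W$ into $W_1$ and $W_2$, where 

$$W_1 := \Big\{ w \in W \ \Big| \ \inf_{j \in [n]} D\Big( \frac{w_{I(w) \setminus \{j\}}}{\| w_{I(w) \setminus \{j\}} \|_2} \Big) \le \exp\big( c^*_{7.2}\, pn/(K + R)^2 \big) \Big\} \quad \text{and} \quad W_2 := W \setminus W_1.$$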

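As in the proof of Theorem 1.1, we show that the probability that there exists $v \in W_1$ such that 
$A_n^{D} v = 0$ is small. To this end, we decompose $W_1$ into the union of the sets $S_L$ as in (5.3). We will 
show that 

$$\mathbb{P}\Big( \inf_{v \in S_L} \big\| A_n^{D,m} v \big\|_2 \le \frac{c_{7.2}\, \rho\, \varepsilon_0 \sqrt{pn}}{2} \ \text{ and } \ \big\| A_n^{D,m} - \mathbb{E}A_n^{D,m} \big\| \le K\sqrt{pn} \Big) \le \exp\Big( -\frac{\bar{c}_{7.2}\, n}{2} \Big).$$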

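To establish this bound, we combine Proposition 7.2 with an additional $\varepsilon$-net argument. Note that 
the set $\mu p U_n^m S^{n-1}$ is contained in an interval of length $n^{O(1)}$ in the direction of $\mathbf{1}$, where $\mathbf{1}$ is the 
vector of ones of length $m$. This interval has a small net. Let $\mathcal{Y}_n := \{ \gamma \mathbf{1} : |\gamma| \le \sqrt{np}\, |\mu| \}$. We claim 
that 

$$\mathbb{P}\Big( \inf_{y \in \mathcal{Y}_n} \inf_{v \in S_L} \big\| (A_n^{D,m} - \mu p U_n^m) v - y \big\|_2 \le \frac{c_{7.2}\, \rho\, \varepsilon_0 \sqrt{pn}}{2} \ \text{ and } \ \big\| A_n^{D,m} - \mathbb{E}A_n^{D,m} \big\| \le K\sqrt{pn} \Big) \le \exp\Big( -\frac{\bar{c}_{7.2}\, n}{2} \Big).$$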
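To see this, first note that, using the triangle inequality, we can deduce that for any $y, y' \in \mathbb{R}^m$, 

$$\Big| \inf_{v \in S_L} \big\| (A_n^{D,m} - \mu p U_n^m) v - y \big\|_2 - \inf_{v \in S_L} \big\| (A_n^{D,m} - \mu p U_n^m) v - y' \big\|_2 \Big| \le \| y - y' \|_2.$$

Choose a $\big( c_{7.2}\, \rho\, \varepsilon_0 \sqrt{pn}/2 \big)$-net $\mathcal{N}_n$ of the set $\mathcal{Y}_n$. We proceed by applying Proposition 7.2 for $y \in \mathcal{N}_n$ and 
taking the union bound. Recalling the definition of $\varepsilon_0$, and using the fact that $L \le \exp(c^*_{7.2}\, pn/(K + R)^2)$, 
where $K, R \ge 1$, we note that $|\mathcal{N}_n| \le \exp(\bar{c}_{7.2}\, n/4)$. Thus, shrinking $c^*_{7.2}$ if necessary, the claim 
now follows from a union bound. 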

We further note that 

$$\inf_{y \in \mathcal{Y}_n} \inf_{v \in S_L} \big\| (A_n^{D,m} - \mu p U_n^m) v - y \big\|_2 \le \inf_{v \in S_L} \big\| A_n^{D,m} v \big\|_2,$$

which establishes the claim. 

The infimum over $W_2$ is dealt with using the Lévy concentration function. This part remains 
the same. This yields the desired bound on $p_2$, completing the proof. 

□ 


We now apply Theorem 7.1 to prove Theorem 1.11. 

Proof of Theorem 1.11. Recall that the adjacency matrix $\mathrm{Adj}_n$ of a directed Erdős-Rényi graph 
with edge connectivity probability $p$ is a matrix with zero diagonal and i.i.d. off-diagonal 
entries $a_{i,j} \sim \mathrm{Ber}(p)$. So, if we are able to express $a_{i,j}$ as a product of two random variables $\xi_{i,j}$ and 
$\delta_{i,j}$, where $\xi_{i,j}$ is a random variable with unit variance and bounded fourth moment, and $\delta_{i,j}$ is a 
Bernoulli random variable, then we can use Theorem 7.1 to obtain the desired result. To this end, 
we split the proof into two different cases, $p \le 1/2$ and $p > 1/2$. 

First let us consider the case $p \le 1/2$. There we note that $a_{i,j}$ has the same distribution as $\xi_{i,j}\delta_{i,j}$, 
where $\xi_{i,j} \sim \mathrm{Ber}(1/2)$, and $\delta_{i,j} \sim \mathrm{Ber}(\bar{p})$ with $\bar{p} = 2p$. Thus applying Theorem 7.1, we obtain that 
there exist constants $0 < c, \bar{c}, C, \bar{C} < \infty$, depending only on $K$ and $R$, such that for 

$$\frac{\bar{C} \log n}{n} \le p \le \frac{1}{2}$$

and any $\varepsilon > 0$, 

$$\mathbb{P}\Big( \Big\{ s_{\min}(\mathrm{Adj}_n + D_n) \le C \varepsilon \exp\Big( -c\, \frac{\log(1/p)}{\log(np)} \Big) \sqrt{\frac{p}{n}} \Big\} \cap \big\{ \| \mathrm{Adj}_n - \mathbb{E}\,\mathrm{Adj}_n \| \le K\sqrt{np} \big\} \Big) \le \varepsilon + \exp(-\bar{c}\, np).$$

Thus it only remains to show that there exists $K \ge 1$ such that 

$$\mathbb{P}\big( \| \mathrm{Adj}_n - \mathbb{E}\,\mathrm{Adj}_n \| \ge K\sqrt{pn} \big) \le \exp(-c_0\, pn),$$

for some small positive absolute constant $c_0$. Using the triangle inequality, we see that it is enough to 
prove the same for $A_n$ with i.i.d. entries $a_{i,j} \sim \mathrm{Ber}(p)$. Since the function $\|A_n - \mathbb{E}A_n\|$, when viewed 
as a function from $\mathbb{R}^{n^2}$ to $\mathbb{R}$, is a 1-Lipschitz, quasi-convex function, using Talagrand's inequality 
(see [5, Theorem 7.12]) we note that 

$$\mathbb{P}\big( \big| \|A_n - \mathbb{E}A_n\| - M_n \big| > t \big) \le 4 \exp(-t^2/4), \tag{7.3}$$

for all $t > 0$, where $M_n$ is the median of $\|A_n - \mathbb{E}A_n\|$. From (7.3), using integration by parts one 
also obtains that $|\mathbb{E}\|A_n - \mathbb{E}A_n\| - M_n| \le C_0$, for some absolute constant $C_0$. Thus it only remains 
to show that $\mathbb{E}\|A_n - \mathbb{E}A_n\| \le C_1\sqrt{np}$, for some other absolute constant $C_1$. 
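
The passage from the tail bound (7.3) to the comparison of mean and median can be made explicit. The following is a short worked computation, writing $X := \|A_n - \mathbb{E}A_n\|$ for brevity (a shorthand introduced here only for this display); it shows that one admissible value is $C_0 = 4\sqrt{\pi}$.

```latex
\[
  \big| \mathbb{E} X - M_n \big|
  \;\le\; \mathbb{E} \big| X - M_n \big|
  \;=\; \int_0^\infty \mathbb{P}\big( |X - M_n| > t \big) \, dt
  \;\le\; 4 \int_0^\infty e^{-t^2/4} \, dt
  \;=\; 4 \sqrt{\pi}.
\]
```

The last equality uses the Gaussian integral $\int_0^\infty e^{-t^2/(2\sigma^2)}\,dt = \sigma\sqrt{\pi/2}$ with $\sigma^2 = 2$.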

Turning to prove the above, we use Seginer's theorem. Since for every $i,j \in [n]$, 

$$\mathrm{Var}\big[ (a_{i,j} - p)^2 \big] \le \mathbb{E}\big[ (a_{i,j} - p)^4 \big] \le p(1 - p), \quad \text{and} \quad \big| (a_{i,j} - p)^2 - \mathbb{E}\big[ (a_{i,j} - p)^2 \big] \big| \le 2,$$

using Bennett's inequality, we obtain that there exist some $t_0 > 0$ and a small positive constant 
$c''$ such that 

$$\mathbb{P}\big( \| A_{n,j} - \mathbb{E}A_{n,j} \|_2^2 \ge tnp \big) \le \exp(-c'' tnp),$$

for all $t \ge t_0$. Now using the union bound and integration by parts, upon applying Seginer's 
theorem, we obtain $\mathbb{E}\|A_n - \mathbb{E}A_n\| \le C_1\sqrt{np}$. This completes the proof of the theorem for $p \le 1/2$. 

For $p > 1/2$ we cannot use the same trick as above to produce the desired product structure. 
Instead, we note that $1 - a_{i,j} \sim \mathrm{Ber}(1 - p)$. We use this observation to create the desired product 
structure. More precisely, we denote by $A'_n$ the matrix with zero diagonal and i.i.d. off-diagonal 
entries $1 - a_{i,j}$. Then we have $A_n + D_n = U_n + D'_n - A'_n$, where $D'_n$ is another diagonal 
matrix such that $(D'_n)_{i,i} = (D_n)_{i,i} - 1$, for $i \in [n]$, and $U_n$ is the $n \times n$ matrix of all ones. Therefore, 
it is now enough to find quantitative estimates on the smallest singular value of $U_n + D'_n - A'_n$, 
where the entries of $A'_n$ have the desired product structure. Due to the presence of $U_n$, we cannot 
directly apply Theorem 7.1 here. However, $\mathrm{rank}(U_n)$ being one, the set $U_n S^{n-1}$ admits an $\varepsilon$-net 
of small cardinality. Therefore, we can modify the proof of Theorem 7.1 accordingly. 
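
The matrix identity used for $p > 1/2$ can be confirmed entry by entry. The snippet below is a small sketch with a hypothetical helper name (`check_complement_identity`) and an arbitrary diagonal chosen only for illustration: it samples a zero-diagonal 0/1 matrix, forms the complemented matrix and the shifted diagonal, and checks that the sum of the adjacency matrix and the diagonal equals the all-ones matrix plus the shifted diagonal minus the complemented matrix.

```python
import random


def check_complement_identity(n, p, seed=0):
    """Check A + D == U + D' - A' entrywise for a random zero-diagonal 0/1 matrix A."""
    rng = random.Random(seed)
    # Adjacency matrix: zero diagonal, off-diagonal entries ~ Ber(p).
    a = [[0 if i == j else int(rng.random() < p) for j in range(n)]
         for i in range(n)]
    # Diagonal entries of D_n (arbitrary values, for illustration only).
    d = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    # A'_n: zero diagonal, off-diagonal entries 1 - a_{i,j} ~ Ber(1 - p).
    a_prime = [[0 if i == j else 1 - a[i][j] for j in range(n)]
               for i in range(n)]
    # D'_n: diagonal entries shifted by -1.
    d_prime = [d[i] - 1.0 for i in range(n)]
    for i in range(n):
        for j in range(n):
            lhs = a[i][j] + (d[i] if i == j else 0.0)
            rhs = 1.0 + (d_prime[i] if i == j else 0.0) - a_prime[i][j]
            if abs(lhs - rhs) > 1e-12:
                return False
    return True


identity_holds = check_complement_identity(8, 0.8)
```

On the diagonal the all-ones contribution is cancelled by the shift of $-1$ in the diagonal matrix, and off the diagonal it is cancelled by the constant part of $1 - a_{i,j}$.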

To this end, recall that the proof of Theorem 7.1 can be broadly divided into two parts. In 
the first part we control the infimum over compressible and dominated vectors (see Proposition 
7.3), and in the second part we consider incompressible vectors (see Proposition 7.2). Since in 
Proposition 7.3 we use the folding trick, we do not feel the presence of $U_n$. There, the proof remains 
unchanged. Proposition 7.2 calls for an additional $\varepsilon$-net. Since the cardinality of such a net is 
small, it does not ruin the proof, and it only worsens the constants. Thus the proof of this theorem 
is complete. □ 